---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >-
    The answer provided directly relates to the question asked and is
    well-supported by the document, which explains the percentage in the
    response status column as the total amount of successful completion of
    response actions. The answer is concise and specific to the query.

    Final evaluation: Good
- text: >-
    Evaluation:

    The answer states that the provided information does not cover the
    specific query, suggesting referring to additional sources or providing
    more context. However, the document does cover the process of enabling and
    configuring Endpoint controls and mentions specific features under
    Endpoint controls like Device Control, Personal Firewall Control, and Full
    Disk Encryption Visibility. The document does not explicitly state the
    "purpose" of Endpoint controls, but it is evident from the listed features
    that these controls are for managing device control, firewall settings,
    and disk encryption visibility. Therefore, the answer is not
    well-supported by the document and fails to address the specific question
    adequately.

    Final evaluation: Bad
- text: >-
    Reasoning:

    1. **Context Grounding**: The answer is supported by the provided document
    where it is mentioned that the On-Site Collector Agent collects logs and
    forwards them to <ORGANIZATION> XDR.

    2. **Relevance**: The purpose of the <ORGANIZATION> XDR On-Site Collector
    Agent is indeed to collect and securely forward logs.

    3. **Conciseness**: The answer is concise and directly addresses the
    specific question asked without unnecessary information.

    4. **Specificity**: The answer is specific to the question regarding the
    purpose of the On-Site Collector Agent, without being too general.

    5. **Key/Value/Event Name**: Although the answer does not include keys or
    values from the document, it is not necessary for this specific question
    about the purpose of the agent.

    The answer meets all the criteria effectively.

    Final evaluation: Good
- text: >-
    The provided answer does not align well with the document. Here's a
    detailed analysis of the evaluation criteria:

    1. **Context Grounding**: The answer does not seem to be backed up by the
    specifics provided in the document. The document describes settings around
    making sensors stale, archived, or deleted and associated email
    notifications, but it does not explicitly mention a checkbox for email
    notifications in the Users section.

    2. **Relevance**: The answer does not correctly address the specific query
    about the checkbox in the Users section as per the document content.

    3. **Conciseness**: While the answer is concise, it is not directly
    supported by the content of the document, making it irrelevant.

    4. **Specificity**: The answer lacks specific details or a direct quote
    from the document that mentions the Users section checkbox.

    5. **Accuracy in Key/Value/Event Name**: The document does not provide
    details about a checkbox for email notifications in the Users section,
    thus the key/value/event name aspect is also not correctly covered.

    Based on these points, the answer provided fails to meet the necessary
    criteria.

    Final evaluation: **Bad**
- text: >-
    **Reasoning**:

    1. **Context Grounding**: The answer does not match the context provided
    in the document. The document specifies different URLs for images related
    to DNS queries and connection queries.

    2. **Relevance**: The answer is not relevant to the specific question
    asked. The question asks for the URL of the image for the second query,
    which is clearly provided in the document but not correctly retrieved in
    the answer.

    3. **Conciseness**: The answer is concise but incorrect, making it not
    useful.

    4. **Specificity**: The answer lacks accuracy, which is critical for
    answering the specific question. It provides an incorrect URL.

    5. **Key, Value, Event Name**: Since the question is about a specific URL,
    correctness of the key/value is crucial, which the answer fails to
    provide.

    **Final evaluation**: Bad
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.5492957746478874
      name: Accuracy
---
SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
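To make the second step concrete, here is a minimal, hypothetical sketch of fitting a LogisticRegression head on embeddings from the Sentence Transformer body. The texts and labels are placeholders, not this model's actual training data; in practice the SetFit `Trainer` performs both the contrastive fine-tuning and the head training for you.

```python
# Sketch of the second stage only: embed texts with the Sentence Transformer body,
# then fit a LogisticRegression head on those embeddings.
# The texts and labels below are placeholders, not the model's actual training data.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

body = SentenceTransformer("BAAI/bge-base-en-v1.5")

texts = [
    "The answer is concise, grounded in the document, and addresses the question.\nFinal evaluation: Good",
    "The answer is not supported by the document and does not address the question.\nFinal evaluation: Bad",
]
labels = [1, 0]  # hypothetical mapping of reasoning texts to the two classes

embeddings = body.encode(texts)                      # sentence embeddings from the body
head = LogisticRegression().fit(embeddings, labels)  # classification head
print(head.predict(embeddings))
```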
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: BAAI/bge-base-en-v1.5
- Classification head: a LogisticRegression instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 2 classes
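Assuming the standard `SetFitModel` API, the components described above can be inspected directly after loading the model:

```python
# Inspect the components described above after loading the model.
# Attribute names follow the public SetFitModel API.
from setfit import SetFitModel

model = SetFitModel.from_pretrained(
    "Netta1994/setfit_baai_cybereason_gpt-4o_cot-few_shot-instructions_only_reasoning_1726751890.998"
)
print(model.model_body)                 # SentenceTransformer body (BAAI/bge-base-en-v1.5)
print(model.model_head)                 # LogisticRegression classification head
print(model.model_body.max_seq_length)  # expected to be 512
```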
Model Sources
- Repository: [SetFit on GitHub](https://github.com/huggingface/setfit)
- Paper: [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- Blogpost: [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
Model Labels
Label | Examples |
---|---|
0 | |
1 | |
Evaluation
Metrics
Label | Accuracy |
---|---|
all | 0.5493 |
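For reference, an accuracy score like the one above could be recomputed on a labeled evaluation split roughly as follows. The texts and gold labels here are placeholders, not the actual test set.

```python
# Sketch of recomputing accuracy on a labeled test split (placeholder data).
from setfit import SetFitModel

model = SetFitModel.from_pretrained(
    "Netta1994/setfit_baai_cybereason_gpt-4o_cot-few_shot-instructions_only_reasoning_1726751890.998"
)

test_texts = [
    "The answer is grounded in the document and directly answers the question.\nFinal evaluation: Good",
    "The answer is not supported by the document.\nFinal evaluation: Bad",
]
test_labels = [1, 0]  # placeholder gold labels

preds = [int(p) for p in model.predict(test_texts)]
accuracy = sum(int(p == y) for p, y in zip(preds, test_labels)) / len(test_labels)
print(f"accuracy: {accuracy:.4f}")
```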
Uses
Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_cybereason_gpt-4o_cot-few_shot-instructions_only_reasoning_1726751890.998")
# Run inference
preds = model("The answer provided directly relates to the question asked and is well-supported by the document, which explains the percentage in the response status column as the total amount of successful completion of response actions. The answer is concise and specific to the query.\nFinal evaluation: Good")
```
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 19 | 77.9420 | 193 |

Label | Training Sample Count |
---|---|
0 | 34 |
1 | 35 |
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (5, 5)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
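As a rough guide, a training run with the hyperparameters above could be set up as sketched below. The training data is a placeholder (the real dataset is not part of this card), and the `Trainer`/`TrainingArguments` usage assumes the SetFit 1.x API.

```python
# Sketch of a training run using the hyperparameters listed above.
# The training data below is a placeholder; the real dataset is not included in this card.
from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

train_dataset = Dataset.from_dict({
    "text": [
        "The answer is grounded in the document.\nFinal evaluation: Good",
        "The answer is not supported by the document.\nFinal evaluation: Bad",
    ],
    "label": [1, 0],
})

args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(5, 5),
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```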
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0058 | 1 | 0.2388 | - |
0.2890 | 50 | 0.2629 | - |
0.5780 | 100 | 0.2313 | - |
0.8671 | 150 | 0.0609 | - |
1.1561 | 200 | 0.0033 | - |
1.4451 | 250 | 0.0024 | - |
1.7341 | 300 | 0.0022 | - |
2.0231 | 350 | 0.0018 | - |
2.3121 | 400 | 0.0018 | - |
2.6012 | 450 | 0.0016 | - |
2.8902 | 500 | 0.0015 | - |
3.1792 | 550 | 0.0014 | - |
3.4682 | 600 | 0.0013 | - |
3.7572 | 650 | 0.0014 | - |
4.0462 | 700 | 0.0014 | - |
4.3353 | 750 | 0.0013 | - |
4.6243 | 800 | 0.0012 | - |
4.9133 | 850 | 0.0012 | - |
Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.0
- Transformers: 4.44.0
- PyTorch: 2.4.1+cu121
- Datasets: 2.19.2
- Tokenizers: 0.19.1
Citation
BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```