
SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer (both phases are sketched in code below).
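
For reference, a minimal sketch of this two-phase loop with the setfit Trainer; the toy dataset here is a hypothetical placeholder and the hyperparameters are shortened (the full configuration is listed under Training Hyperparameters below):

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical toy dataset -- replace with real few-shot examples and 0/1 labels
train_dataset = Dataset.from_dict({
    "text": [
        "Reasoning: grounded and concise. Final evaluation: Good",
        "Reasoning: does not address the question. Final evaluation: Bad",
    ],
    "label": [1, 0],
})

# Start from the same Sentence Transformer body named above
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

args = TrainingArguments(batch_size=16, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# trainer.train() runs both phases: contrastive fine-tuning of the embedding
# body, then fitting the classification head on the tuned embeddings
trainer.train()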

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-base-en-v1.5
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2
  • Model size: 109M parameters (F32, Safetensors)

Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label 0:
  • 'Evaluation:\nThe answer does not directly address the specific question asked. The document provides details on why considering all the answers together when determining if the behavior in a MalOp is malicious is important, such as assessing the significance of involved machines, behaviors, and users. However, the provided answer is too general and fails to capture these specifics.\n\nFinal evaluation: Bad'
  • 'Reasoning:\n\n1. Context Grounding: The answer does not reference the provided document at all and instead suggests seeking additional sources. The steps to exclude a MalOp during the remediation phase are clearly present in the document.\n2. Relevance: The answer does not address the specific question asked. The question seeks the process to exclude a MalOp, and the provided document contains specific steps to achieve this.\n3. Conciseness: The answer is brief but unhelpfully so, as it completely lacks pertinent information available in the document.\n4. Specifics: The document does contain specific instructions on how to exclude a MalOp, which the answer fails to acknowledge or explain.\n5. Key/Value/Event Name: There are relevant actions and links identified in the document, which are not mentioned in the answer.\n\nFinal Evaluation: Bad.'
  • 'Evaluation:\n\n1. Context Grounding: The provided answer is well-supported by the document, specifically stating that a quarantined file should be un-quarantined before submitting it.\n2. Relevance: The answer is directly related to the question, addressing what should be done with a quarantined file.\n3. Conciseness: The answer is concise and to the point, clearly stating the necessary action.\n4. Specificity: The answer gives a particular instruction relevant to the question.\n5. Key Identification: The answer correctly identifies the key action ("un-quarantine") directly from the document.\n\nFinal result: Good'
Label 1:
  • "Reasoning:\n1. Context Grounding: The answer is directly supported by the document, as it explains what happens after configuring the computer to generate a memory dump file.\n2. Relevance: The answer is relevant to the question asked and addresses it directly.\n3. Conciseness: The answer is concise and to the point.\n4. Specificity: The answer specifically mentions the generation of a dump file containing the entire contents of the sensor's RAM, which is directly pulled from the document.\n\nFinal result: Good"
  • 'Evaluation:\n1. Context Grounding: The answer is grounded in the document, which mentions that the platform uses an advanced engine to identify cyber security threats.\n2. Relevance: The answer directly addresses the question by stating the purpose of the threat detection abilities.\n3. Conciseness: The answer is clear and to the point.\n4. Specificity: The answer correctly identifies the purpose of the threat detection abilities as detailed in the document.\n5. Keys/Values/Events: Not applicable in this scenario.\n\nFinal evaluation: Good'
  • 'Reasoning:\nThe answer provided does not address the specific severity score for the fifth scenario in the document. Instead, it suggests that the document does not cover this query and refers to additional sources, which is incorrect. The document contains information about four scenarios, and there is no fifth scenario mentioned within it. The answer should accurately state that there is no fifth scenario provided in the document.\n\nFinal Result: Bad'

Evaluation

Metrics

Label Accuracy
all 0.5493
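
For context, a label-accuracy figure like this can be reproduced by predicting on a held-out set; a minimal sketch, assuming hypothetical evaluation texts and gold labels:

from setfit import SetFitModel
from sklearn.metrics import accuracy_score

# Hypothetical held-out evaluation data with gold labels
eval_texts = [
    "The answer is grounded and concise. Final evaluation: Good",
    "The answer ignores the document. Final evaluation: Bad",
]
eval_labels = [1, 0]

model = SetFitModel.from_pretrained("Netta1994/setfit_baai_cybereason_gpt-4o_cot-few_shot-instructions_only_reasoning_1726751890.998")
preds = model.predict(eval_texts)
print(accuracy_score(eval_labels, preds))  # fraction of correct predictions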

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_cybereason_gpt-4o_cot-few_shot-instructions_only_reasoning_1726751890.998")
# Run inference
preds = model("The answer provided directly relates to the question asked and is well-supported by the document, which explains the percentage in the response status column as the total amount of successful completion of response actions. The answer is concise and specific to the query.\n\nFinal evaluation: Good")

Training Details

Training Set Metrics

Training set Min Median Max
Word count 19 77.9420 193
Label Training Sample Count
0 34
1 35

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (5, 5)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
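
These values mirror the fields of setfit.TrainingArguments, so the configuration can be reconstructed directly; a sketch, assuming the field names match this dump:

from setfit import TrainingArguments
from sentence_transformers.losses import CosineSimilarityLoss

args = TrainingArguments(
    batch_size=(16, 16),            # (embedding phase, classifier phase)
    num_epochs=(5, 5),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,      # loss class; instantiated internally
    margin=0.25,                    # only used by margin-based losses
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
# distance_metric is left at its default (cosine distance), matching the dump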

Training Results

Epoch Step Training Loss Validation Loss
0.0058 1 0.2388 -
0.2890 50 0.2629 -
0.5780 100 0.2313 -
0.8671 150 0.0609 -
1.1561 200 0.0033 -
1.4451 250 0.0024 -
1.7341 300 0.0022 -
2.0231 350 0.0018 -
2.3121 400 0.0018 -
2.6012 450 0.0016 -
2.8902 500 0.0015 -
3.1792 550 0.0014 -
3.4682 600 0.0013 -
3.7572 650 0.0014 -
4.0462 700 0.0014 -
4.3353 750 0.0013 -
4.6243 800 0.0012 -
4.9133 850 0.0012 -

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.0
  • Transformers: 4.44.0
  • PyTorch: 2.4.1+cu121
  • Datasets: 2.19.2
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}