SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
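The first of these two steps depends on turning a small labeled set into many sentence pairs. The sketch below illustrates one way such pairs could be built; `make_contrastive_pairs` is a hypothetical helper written for this example, not SetFit's own implementation, and the sample texts are invented. The second step (fitting a classifier on the resulting embeddings) is only described in the comments.

```python
import random


def make_contrastive_pairs(texts, labels, num_iterations, seed=42):
    """Hypothetical helper: build (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    For each sample and each iteration, one positive and one negative
    pair are drawn (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    indices = range(len(texts))
    pairs = []
    for _ in range(num_iterations):
        for i in indices:
            same = [j for j in indices if j != i and labels[j] == labels[i]]
            diff = [j for j in indices if labels[j] != labels[i]]
            if same:
                pairs.append((texts[i], texts[rng.choice(same)], 1.0))
            if diff:
                pairs.append((texts[i], texts[rng.choice(diff)], 0.0))
    return pairs


# Invented toy data, not taken from the training set.
toy_texts = ["grounded answer", "well supported", "not in document", "contradicts source"]
toy_labels = [1, 1, 0, 0]

toy_pairs = make_contrastive_pairs(toy_texts, toy_labels, num_iterations=1)
for text_a, text_b, target in toy_pairs:
    print(target, "|", text_a, "|", text_b)

# Step 1 would fine-tune the embedding model so that pairs with target 1.0
# end up close together and pairs with target 0.0 end up far apart.
# Step 2 would embed the original texts with the fine-tuned model and fit
# a classifier (here, logistic regression) on those embeddings.
```

Because the pairs grow with the number of iterations rather than with the number of labeled samples alone, a few dozen examples can still yield a sizeable contrastive training set.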

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-base-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2 classes

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label Examples

Label	Examples
1	'Evaluation:\nThe answer is highly relevant and aligns well with the document. It explains the steps to resolve the "Couldn't Fetch" error in the sitemap, which impacts the discovery of pages. The steps provided, such as checking the sitemap URL path, resubmitting the URL if incorrect, and using the inspection tool, are directly grounded in the document provided. There are no discrepancies, and the key points are covered comprehensively.\n\nThe final evaluation:' 'Reasoning:\nThe answer provided is clear and well-structured, capturing the steps needed to enable clients to book multiple participants for a service in `<ORGANIZATION>` Bookings. However, there is a discrepancy in step 6 where "John Youngimum" is mentioned instead of "Maximum". This typographical error could potentially confuse readers.\n\nFinal result:' 'Evaluation:\nThe provided answer does address the question, giving a specific response related to the error encountered while changing the location for booking services. It mentions a known issue and asserts that it has been resolved. However, the answer is not grounded in the provided document, which does not contain any information about booking services or the specific issue mentioned.\n\nFinal evaluation:'
0	"The answer given is not completely accurate based on the provided document. The document mentions that you cannot transfer your Bookings App from one site to another, but it does not explicitly state that you cannot update the booking app on your site. The answer's suggestion to vote for future features is correct but incomplete and misleading in the context of the user's question about updating the app.\n\nFinal Evaluation:" 'The answer is comprehensive and well-organized, clearly detailing the steps required to add a service, including additional details for setting up a service page for site members only. The response accurately reflects the structure and content of the provided document. Each step corresponds to instructions within the document, ensuring all critical aspects are covered.\n\nHowever, there is a slight issue in the clarity due to minor typographical errors ("youre" instead of "you're"). While these don't detract significantly, they do impact the overall professionalism.\n\nFinal evaluation:' 'The answer covers the steps to display blog categories on the blog feed as described in the document, following a logical sequence that mirrors the outlined procedure. However, it fails to adequately articulate one specific point, using the placeholder "95593638" instead of the correct term "create." This error is repeated multiple times, which could cause confusion and render the instructions largely unusable.\n\nHence, while the general content is correct, the improper terminology application significantly reduces its effectiveness and usability.\n\nFinal Evaluation:'


Evaluation

Metrics

Label	Accuracy
all	0.6667
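For reference, accuracy is the fraction of predictions that match the true labels. The sketch below shows the computation on invented labels; the `accuracy` function and the toy lists are assumptions for illustration and do not reproduce this model's evaluation set, whose size is not stated here.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Invented labels: 4 of 6 predictions are correct.
toy_true = [1, 0, 1, 1, 0, 0]
toy_pred = [1, 0, 0, 1, 1, 0]
print(round(accuracy(toy_true, toy_pred), 4))
```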

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wix_qa_gpt-4o_cot-few_shot_remove_final_evaluation_e1_1726759073.27929")
# Run inference (triple quotes keep the multi-line input a valid Python string)
preds = model("""Evaluation:
The answer is directly grounded in the document provided. It clearly and correctly outlines the steps to change the reservation reference from the service page to the booking calendar. The steps in the document and the answer align perfectly.

The final evaluation:""")

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	30	82.6444	209

Label	Training Sample Count
0	22
1	23
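Statistics like those in the word-count table can be computed with a few lines of Python. The sketch below assumes words are split on whitespace, which is an assumption rather than something the card states; `word_count_stats` is a hypothetical helper. Note that the middle value in the table, 82.6444, equals 3719 / 45 rounded to four places, which is what an arithmetic mean over the 45 training samples (22 + 23) would give; a median over an odd number of integer counts would itself be an integer.

```python
from statistics import mean, median


def word_count_stats(texts):
    """Hypothetical helper: whitespace word-count statistics for a list of texts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "mean": mean(counts),
        "median": median(counts),
        "max": max(counts),
    }


# Invented toy texts, not taken from the training set.
toy_stats = word_count_stats(["one two three", "one", "one two"])
print(toy_stats)

# Consistency check against the reported figures: 22 + 23 = 45 samples.
num_samples = 22 + 23
print(round(3719 / num_samples, 4))
```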

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0088	1	0.202	-
0.4425	50	0.2406	-
0.8850	100	0.1459	-
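The Epoch and Step columns above can be related to the hyperparameters with a short calculation. The sketch assumes that each of the 20 iterations yields one positive and one negative pair per training sample; that pairing rule is an assumption made here, although the resulting step count is consistent with the table. `contrastive_steps_per_epoch` is a hypothetical helper.

```python
import math


def contrastive_steps_per_epoch(num_samples, num_iterations, batch_size):
    """Return (number of pairs, optimizer steps per epoch).

    Assumption: one positive and one negative pair per sample per iteration.
    """
    num_pairs = 2 * num_iterations * num_samples
    return num_pairs, math.ceil(num_pairs / batch_size)


# Values taken from the card: 22 + 23 samples, num_iterations 20, batch_size 16.
num_pairs, steps = contrastive_steps_per_epoch(22 + 23, 20, 16)
print(num_pairs, steps)

# Fraction of the epoch completed at each logged step.
for logged_step in (1, 50, 100):
    print(logged_step, round(logged_step / steps, 4))
```

Under this assumption a full epoch is 113 steps, so the last logged row (step 100) sits at roughly 88% of the single training epoch.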

Framework Versions

Python: 3.10.14
SetFit: 1.1.0
Sentence Transformers: 3.1.0
Transformers: 4.44.0
PyTorch: 2.4.1+cu121
Datasets: 2.19.2
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
