SetFit with intfloat/multilingual-e5-large

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-large as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
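
The contrastive step above works on sentence pairs rather than single labeled sentences. The following is a minimal sketch of how such pairs can be built from a few labeled examples, assuming one same-label and one different-label partner is drawn per sentence per iteration. The function name make_contrastive_pairs is hypothetical and this is a simplified illustration, not SetFit's actual sampler.

```python
import random


def make_contrastive_pairs(texts, labels, num_iterations=20, seed=42):
    """Build (text_a, text_b, target) pairs for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Simplified illustration only; not SetFit's internal implementation.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidate partners with the same label (excluding the sentence itself)
            same = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
            # Candidate partners with any other label
            diff = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


# Four example queries taken from the label table of this card (labels 11 and 1)
texts = [
    "Which levers to prioritize to gain share in CSDS?",
    "How can I gain share in NCBS?",
    "Which levers have led the share loss of KOF in Colas in Q4",
    "Why is Resto losing share in Cuernavaca Colas SS RET Original?",
]
labels = [11, 11, 1, 1]

pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
print(len(pairs))  # 2 pairs per sentence per iteration: 2 * 4 * 2 = 16
```

The fine-tuned embeddings of the original sentences are then used as features for the classification head in the second step.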

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: intfloat/multilingual-e5-large
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 12

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
0	'what are the top brands contributing to share gain for Jumex in Cuernavaca in 2022' 'Apart from Jugos + Néctares, Which are the top contributing categoriesXconsumo to the share loss for Jumex in Orizaba in 2021?' 'what are the top brands contributing to share gain/loss for KOF in Cuernavaca in2022'
2	"What is the trend of Danone's market share in Colas SS in Cuernavaca from 2019 to YTD 2023?" 'Are there any notable shifts in market share for KOF from 2021 to 2022 in TT OP' 'In which categories KOF has gained most share in TT OP Cuernavaca 2021-2022'
3	'What is the avg pack size for an offering within the 12.1-15 price bracket for Agua in TT HM, for top KOF brand vs Top competitor brand?' 'How should KOF gain share in <10 price bracket for NCB in TT HM' 'What is the price range for CSD in TT HM?'
5	'What are the untapped opportunities in Graffon?' 'Help me with new categories to expand in for kof' 'I am a category manager for agua at kof. Tell me what areas to prioritize for category development'
8	'Which month and at what price was my share highest' 'What is the sku range and velocity of KOF in colas' 'distribution wise, which non csd skus are doing the best?'
11	'Which levers to prioritize to gain share in Orizaba Colas MS_PET_RET?' 'Which levers to prioritize to gain share in CSDS?' 'How can I gain share in NCBS?'
9	'How much headroom do I have in AGUA' 'What measures can be taken to maximize headroom in the AGUA market?' 'Which industries to prioritize to gain share in CSDS in TT HM?'
10	'Which pack segment shows opportunities to drive my market share in CSDs Colas MS?' 'What are my priority pack segments to gain share in AGUA Colas SS?' 'What are my priority pack segments to gain share in NCB Colas SS?'
1	'Which levers have led the share loss of KOF in Colas in Q4' 'Why is Resto losing share in Cuernavaca Colas SS RET Original?' 'What are the main factors contributing to the share gain of Jumex in Still Drinks MS in Orizaba for FY 2022?'
7	'Is there any PPL correction scope for Valle Frut within TT OP?' 'Is there a need for PPL correction in the energy drink offerings of Red Bull within the Energy Drinks category?' 'Is CC a premium brand? How premium are its offerings as compared to other brands in Colas?'
4	'What is the industry mix of CSDS' 'How has the csd industry evolved in the last two years?' 'What is the change in industry mix for coca-cola in TT HM Orizaba in 2021 to 2022'
6	"I'm interested in launching a new orange flavored offering in new york city in the (TT OP) category. What pack sizes would be most suitable for this market?" 'I want to launch a new pack type in csd for kof. Tell me what' 'Within Colas MS, which pack segments are dominated by Red cola in Cuernavaca? Do we have any offerings to compete with the same?'

Evaluation

Metrics

Label	Accuracy
all	0.25
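
Accuracy here is the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch of that computation follows. The evaluation set is not described in this card, so the predictions and references below are made-up placeholders, and the chance baseline assumes a uniform guess over the 12 classes.

```python
def accuracy(predictions, references):
    """Fraction of predictions that equal the corresponding reference label."""
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be non-empty and equally long")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Placeholder data (not from the actual evaluation set): 1 of 4 correct
example_accuracy = accuracy([0, 1, 2, 3], [0, 5, 6, 7])
print(example_accuracy)  # 0.25

# Uniform random guessing over 12 classes, for comparison
chance_baseline = 1 / 12
print(round(chance_baseline, 4))  # 0.0833
```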

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("vgarg/fw_identification_model_e5_large_v5_14_12_23")
# Run inference
preds = model("Why is KOF losing share in Cuernavaca Colas MS RET Original?")
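
To classify several queries at once, the model can also be called with a list of strings. The helper below is a small sketch of that pattern. classify_queries and stub_model are hypothetical names introduced for illustration, and the stub only stands in for the real model so the example runs offline. It assumes the model returns one integer-like prediction per input.

```python
from typing import Callable, Dict, List, Sequence


def classify_queries(model: Callable[[List[str]], Sequence], queries: List[str]) -> Dict[str, int]:
    """Run a batch of queries through `model` and map each query to its label id."""
    preds = model(queries)
    # int() also converts scalar tensors or NumPy integers to plain Python ints
    return {query: int(pred) for query, pred in zip(queries, preds)}


def stub_model(queries: List[str]) -> List[int]:
    """Hypothetical stand-in for the real model: always predicts label 0."""
    return [0 for _ in queries]


result = classify_queries(stub_model, ["How can I gain share in NCBS?"])
print(result)  # {'How can I gain share in NCBS?': 0}

# With the real model (requires `pip install setfit` and network access):
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained("vgarg/fw_identification_model_e5_large_v5_14_12_23")
#   print(classify_queries(model, ["How can I gain share in NCBS?"]))
```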

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	5	13.8362	33

Label	Training Sample Count
0	10
1	10
2	10
3	10
4	10
5	10
6	10
7	10
8	10
9	10
10	10
11	6

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (3, 3)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
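
The loss listed above, CosineSimilarityLoss, penalizes the squared difference between the cosine similarity of two sentence embeddings and a target of 1.0 (same label) or 0.0 (different label). The sketch below shows that computation in plain Python on toy 2-dimensional vectors. It is an illustration of the idea, not the Sentence Transformers implementation, and the vectors are made up. SetFit's documentation describes distance_metric and margin as triplet-loss settings, so they presumably had no effect with this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings: a well-separated case and a badly-separated case
good = [([1.0, 0.0], [1.0, 0.0], 1.0), ([1.0, 0.0], [0.0, 1.0], 0.0)]
bad = [([1.0, 0.0], [1.0, 0.0], 0.0)]

print(cosine_similarity_loss(good))  # 0.0
print(cosine_similarity_loss(bad))   # 1.0
```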

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0034	1	0.3504	-
0.1724	50	0.1647	-
0.3448	100	0.0301	-
0.5172	150	0.0113	-
0.6897	200	0.0026	-
0.8621	250	0.0012	-
1.0345	300	0.0006	-
1.2069	350	0.001	-
1.3793	400	0.0007	-
1.5517	450	0.0004	-
1.7241	500	0.0006	-
1.8966	550	0.0005	-
2.0690	600	0.0005	-
2.2414	650	0.0004	-
2.4138	700	0.0003	-
2.5862	750	0.0005	-
2.7586	800	0.0004	-
2.9310	850	0.0003	-
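
The Epoch and Step columns can be related to the hyperparameters and sample counts listed earlier. The arithmetic below is a sketch under one assumption: that each training sentence yields one positive and one negative pair per iteration, so an epoch contains 2 * num_iterations * n_samples pairs. That formula is inferred rather than stated in this card, but it reproduces the logged epoch fractions.

```python
import math

# Training sample counts from this card: labels 0-10 have 10 samples, label 11 has 6
sample_counts = {label: 10 for label in range(11)}
sample_counts[11] = 6
n_samples = sum(sample_counts.values())  # 116

# Hyperparameters from this card
num_iterations = 20
batch_size = 16
num_epochs = 3

# Assumption: one positive and one negative pair per sentence per iteration
pairs_per_epoch = 2 * num_iterations * n_samples           # 4640
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)  # 290
total_steps = steps_per_epoch * num_epochs                 # 870

# Compare with the logged rows at steps 50 and 850
print(round(50 / steps_per_epoch, 4))   # 0.1724
print(round(850 / steps_per_epoch, 4))  # 2.931
```

Under this assumption, training ran for 870 steps in total, with the last logged row at step 850.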

Framework Versions

Python: 3.10.12
SetFit: 1.0.1
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu118
Datasets: 2.15.0
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
