omymble committed on
Commit
1e87569
1 Parent(s): 833688f

Push model using huggingface_hub.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
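The flags in this file select mean pooling: `pooling_mode_mean_tokens` is the only mode set to true, so the sentence embedding is the average of the token vectors and keeps the 768-dimensional width. A minimal sketch of that idea in plain Python, assuming a hypothetical `mean_pool` helper and a 0/1 attention mask that excludes padding tokens (the actual sentence-transformers Pooling module works on tensors and is not reproduced here):

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only the vectors of real (unmasked) tokens.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example with 3-dimensional vectors; the third token is padding.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

The padding vector is ignored, so only the first two tokens contribute to the average.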
README.md ADDED
@@ -0,0 +1,220 @@
+ ---
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
+ datasets:
+ - omymble/setfit-books-categories
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - absa
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: His fantasy works are not cliché or based on traditional fantasy but they
+     are full of fresh, imagination and worlds and characters we can learn to love
+ - text: I found this a good book from a good author
+ - text: This is dark fantasy at its best
+ - text: Mister Monday is an interesting Fantasy novel that draws readers in from the
+     very beginning
+ - text: I found this a good book from a good author
+ inference: false
+ ---
+
+ # SetFit Polarity Model with sentence-transformers/paraphrase-mpnet-base-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model trained on the [omymble/setfit-books-categories](https://huggingface.co/datasets/omymble/setfit-books-categories) dataset that can be used for Aspect Based Sentiment Analysis (ABSA). This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification. In particular, this model is in charge of classifying aspect polarities.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
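The contrastive step relies on sentence pairs built from the labelled examples: two sentences with the same label form a positive pair, and two with different labels form a negative pair. A rough sketch of that pairing idea, assuming a hypothetical `make_pairs` helper that enumerates every pair exhaustively (SetFit's own sampler, including the `oversampling` strategy listed in the hyperparameters, is more involved and is not reproduced here):

```python
from itertools import combinations
from typing import List, Tuple


def make_pairs(texts: List[str], labels: List[str]) -> List[Tuple[str, str, float]]:
    """Return (text_a, text_b, target) triples; target is 1.0 for same-label pairs, else 0.0."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs


# Three example sentences taken from the Model Labels table of this card.
texts = [
    "Here's a thriller that really thrills",
    "I love fantasy and science fiction, but this storyteller forgot something very important",
    "Anne Kingston did a marvellous job on this book",
]
labels = ["CONTENT#GENRE", "CONTENT#GENRE", "BOOK#GENERAL"]
pairs = make_pairs(texts, labels)
```

Targets of 1.0 and 0.0 are the kind of similarity targets a cosine-similarity loss can be trained against; the classification head is then fitted on embeddings from the fine-tuned body.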
+
+ This model was trained within the context of a larger system for ABSA, which looks like so:
+
+ 1. Use a spaCy model to select possible aspect span candidates.
+ 2. Use a SetFit model to filter these possible aspect span candidates.
+ 3. **Use this SetFit model to classify the filtered aspect span candidates.**
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **spaCy Model:** en_core_web_lg
+ - **SetFitABSA Aspect Model:** [setfit-absa-aspect](https://huggingface.co/setfit-absa-aspect)
+ - **SetFitABSA Polarity Model:** [omymble/books-categories](https://huggingface.co/omymble/books-categories)
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 6 classes
+ - **Training Dataset:** [omymble/setfit-books-categories](https://huggingface.co/datasets/omymble/setfit-books-categories)
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:-------------------|:---------|
+ | BOOK#AUDIENCE | <ul><li>'I recommend this for fans of fantasy, or other books by Garth Nix'</li><li>'I first got this book when I was eight and I totally loved it! I have read it every year since then! It is about a pair of twins who are born one VERY good and one EXTREMELY bad'</li><li>'However, I did feel one particular scene might be rather nightmare-inducing for the youngest readers - so recommend this for the ages of 12 and above'</li></ul> |
+ | BOOK#AUTHOR | <ul><li>'Banks has writin better books than this book,'</li><li>"Now in is astonishing new novel, Michael Dobbs throws brilliant fresh light upon Churchill's relationship with the Soviet spy and the twenty months of conspiracy, chance and outright treachery that were to propel Churchill from outcast to messiah and change the course of history"</li><li>'Paul focuses on the problems of an intimate relationship and the decisions the teens make at that moment'</li></ul> |
+ | BOOK#GENERAL | <ul><li>'This is the first book in the Keys to the Kingdom series by Garth Nix'</li><li>'The book is a great read right until the end, so rare in non-fiction'</li><li>'Anne Kingston did a marvellous job on this book'</li></ul> |
+ | BOOK#TITLE | <ul><li>'Personal I loved My Darling My Hamburger'</li><li>'But THE INTRUDERS is pretty much a middling effort, at least when it comes to the plot'</li><li>'After reading several pages I relented and purchased Mister Monday'</li></ul> |
+ | CONTENT#CHARACTERS | <ul><li>"She's not a great writer but she's a fabulous storyteller and her Tony Hill/Carol Jordan mysteries are the best of the bunch"</li><li>'but before he can do that he has to dodge fechters, run from enemys like Noon and Dawn, run from dinosaurs, try not to get killed, and try to prevent himself from having a asthma atackk!! But, thankfully he has some help from a girl named suzy, a guy named Dusk, and a talking toad'</li><li>'But when a fight emerges between the two figures - Mister Monday and Sneezer - they both disappear without any further regard to Arthur'</li></ul> |
+ | CONTENT#GENRE | <ul><li>'I love fantasy and science fiction, but this storyteller forgot something very important'</li><li>'At first I was amused an entertained by Angela and Diabola the novel by Lynne Reid Banks, but as it progressed and became exceedingly darker, I read the jacket to find that this book was recommended for ages 9-12'</li><li>"Here's a thriller that really thrills"</li></ul> |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import AbsaModel
+
+ # Download from the 🤗 Hub
+ model = AbsaModel.from_pretrained(
+     "setfit-absa-aspect",
+     "omymble/books-categories",
+ )
+ # Run inference
+ preds = model("The food was great, but the venue is just way too busy.")
+ ```
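The call above returns span-level predictions. As an illustration of post-processing, the sketch below groups spans by predicted label. It assumes each prediction is a dict with `span` and `polarity` keys, which is the shape shown in the SetFit ABSA documentation; treat that as an assumption for your installed version. The `group_by_label` helper and the sample `preds` value are hypothetical, and with this model the `polarity` field would presumably carry one of the six category labels rather than a sentiment.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_label(preds: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Collect predicted spans under their predicted label (hypothetical helper)."""
    grouped = defaultdict(list)
    for pred in preds:
        grouped[pred["polarity"]].append(pred["span"])
    return dict(grouped)


# Hand-written stand-in for model output; NOT produced by running the model.
preds = [
    {"span": "Garth Nix", "polarity": "BOOK#AUTHOR"},
    {"span": "fantasy", "polarity": "CONTENT#GENRE"},
    {"span": "Michael Dobbs", "polarity": "BOOK#AUTHOR"},
]
grouped = group_by_label(preds)
```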
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 2 | 21.0917 | 78 |
+
+ | Label | Training Sample Count |
+ |:-------------------|:----------------------|
+ | BOOK#AUDIENCE | 20 |
+ | BOOK#AUTHOR | 20 |
+ | BOOK#GENERAL | 20 |
+ | BOOK#TITLE | 20 |
+ | CONTENT#CHARACTERS | 20 |
+ | CONTENT#GENRE | 20 |
+
+ ### Training Hyperparameters
+ - batch_size: (128, 128)
+ - num_epochs: (5, 5)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: True
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:----------:|:-------:|:-------------:|:---------------:|
+ | 0.0106 | 1 | 0.2623 | - |
+ | 0.5319 | 50 | 0.1293 | - |
+ | 1.0638 | 100 | 0.0132 | - |
+ | 1.5957 | 150 | 0.0022 | - |
+ | 2.1277 | 200 | 0.0027 | - |
+ | 2.6596 | 250 | 0.0013 | - |
+ | **3.1915** | **300** | **0.0017** | **-** |
+ | 3.7234 | 350 | 0.0015 | - |
+ | 4.2553 | 400 | 0.0029 | - |
+ | 4.7872 | 450 | 0.0015 | - |
+ | 0.0106 | 1 | 0.0115 | - |
+ | 0.5319 | 50 | 0.009 | 0.1324 |
+ | 1.0638 | 100 | 0.0094 | 0.1267 |
+ | 1.5957 | 150 | 0.0007 | 0.1194 |
+ | 2.1277 | 200 | 0.0017 | 0.1256 |
+ | 2.6596 | 250 | 0.0008 | 0.1293 |
+ | **3.1915** | **300** | **0.0007** | **0.1173** |
+ | 3.7234 | 350 | 0.0008 | 0.1231 |
+ | 4.2553 | 400 | 0.0023 | 0.1272 |
+ | 4.7872 | 450 | 0.0008 | 0.1241 |
+
+ * The bold row denotes the saved checkpoint.
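The Epoch and Step columns in the results table are consistent with a fixed number of optimizer steps per epoch. The small sketch below, using only values from the table and the hyperparameter list, recovers that number and the total step count. The helper name `steps_per_epoch` is hypothetical, and the result is an inference from the logged values, not a figure the card states.

```python
def steps_per_epoch(step: int, epoch: float) -> int:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return round(step / epoch)


# (step, epoch) pairs copied from the Training Results table.
logged = [(1, 0.0106), (50, 0.5319), (100, 1.0638), (300, 3.1915), (450, 4.7872)]
estimates = {steps_per_epoch(step, epoch) for step, epoch in logged}

num_epochs = 5    # from "num_epochs: (5, 5)"
batch_size = 128  # from "batch_size: (128, 128)"

# Every logged row should agree, so the set is expected to hold a single value.
per_epoch = max(estimates)
total_steps = per_epoch * num_epochs
# Upper bound only: the last batch of an epoch may be smaller than batch_size.
max_pairs_per_epoch = per_epoch * batch_size
```

Under these assumptions the saved checkpoint at step 300 falls a little past the start of the fourth epoch, which matches the logged epoch value of 3.1915.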
+ ### Framework Versions
+ - Python: 3.10.12
+ - SetFit: 1.0.3
+ - Sentence Transformers: 3.1.0
+ - spaCy: 3.7.4
+ - Transformers: 4.39.0
+ - PyTorch: 2.3.1+cu121
+ - Datasets: 2.20.0
+ - Tokenizers: 0.15.2
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "models/cats/step_300",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.39.0",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.0",
+     "transformers": "4.39.0",
+     "pytorch": "2.3.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "spacy_model": "en_core_web_lg",
+   "normalize_embeddings": false,
+   "span_context": 3,
+   "labels": [
+     "BOOK#AUDIENCE",
+     "BOOK#AUTHOR",
+     "BOOK#GENERAL",
+     "BOOK#TITLE",
+     "CONTENT#CHARACTERS",
+     "CONTENT#GENRE"
+   ]
+ }
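The six labels in this file appear to follow an `ENTITY#ATTRIBUTE` pattern. A short sketch, assuming that reading of the label format and a hypothetical `split_label` helper, that groups the attributes under their entity:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Labels copied from config_setfit.json.
labels = [
    "BOOK#AUDIENCE",
    "BOOK#AUTHOR",
    "BOOK#GENERAL",
    "BOOK#TITLE",
    "CONTENT#CHARACTERS",
    "CONTENT#GENRE",
]


def split_label(label: str) -> Tuple[str, str]:
    """Split 'ENTITY#ATTRIBUTE' at the first '#' (assumed label format)."""
    entity, _, attribute = label.partition("#")
    return entity, attribute


by_entity: Dict[str, List[str]] = defaultdict(list)
for label in labels:
    entity, attribute = split_label(label)
    by_entity[entity].append(attribute)
by_entity = dict(by_entity)
```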
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44259572150ceb4e6b8c9a536fd107c60fb6ad0aefd25d4506a8e8c550da762e
+ size 437967672
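The three lines added for `model.safetensors` are a Git LFS pointer, not the weights themselves: each line is a key followed by a space and a value. A minimal sketch that reads such a pointer and converts the recorded size to MiB; the `parse_lfs_pointer` helper is hypothetical and only handles the simple key/value layout shown here:

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse 'key value' lines of a Git LFS pointer file into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:44259572150ceb4e6b8c9a536fd107c60fb6ad0aefd25d4506a8e8c550da762e
size 437967672
"""
info = parse_lfs_pointer(pointer)
size_mib = int(info["size"]) / (1024 ** 2)
```

The size works out to a little under 418 MiB for the stored weights file.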
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:52b308223f347250fa3ed71a29fc076d612dd0bb5102a7e3b2f68fce733da3c0
+ size 38183
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,66 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff