rbhatia46 committed
Commit ab14733
1 Parent(s): eaa252e

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 1024,
    "pooling_mode_cls_token": true,
    "pooling_mode_mean_tokens": false,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,879 @@
---
base_model: WhereIsAI/UAE-Large-V1
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:3474
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: Microsoft Corporation believes that its success is based upon its
    ability to transform to meet the needs of customers. Its growth strategy includes
    innovation across its cloud platforms and services, as well as investing in complementary
    businesses, products, services, and technologies to extend and grow its product
    offerings.
  sentences:
  - What factors caused the surge in Tesla’s stock prices in the first half of 2023?
  - What's Microsoft growth strategy in the cloud computing sector?
  - How has Microsoft Corporation performed in terms of stock prices over the past
    five years?
- source_sentence: Amazon reported the Q3 2023 earnings revealing a 21% year-over-year
    increase in the revenue, which stood at $116.38 billion. Net income increased
    57% to $6.66 billion, or $13.21 per diluted share, compared to $4.23 billion,
    or $8.42 per diluted share, in third quarter 2022. Amazon Web Services (AWS) revenue
    grew 32% in the quarter to $15 billion.
  sentences:
  - Can you tell about Amazon's Q3 2023 earnings?
  - What was the net income of Microsoft in Fiscal Year 2024?
  - What is the significance of EBITDA in financial analysis?
- source_sentence: For the fiscal year 2024, Walmart had an operating profit margin
    of 20%.
  sentences:
  - What is Pfizer's dividend yield for the financial year 2022?
  - What was Exxon Mobil Corporation's net income for the fourth quarter of 2023?
  - What is the operating profit margin for Walmart for the fiscal year 2024?
- source_sentence: The slowdown in construction, particularly in developing markets,
    resulted in a decrease in demand for Caterpillar's machinery and equipment, which
    negatively impacted the revenue for the year 2022.
  sentences:
  - How did the slow down in construction in 2022 affect Caterpillar's revenues?
  - What is JP Morgan's strategy when it comes to sustainability?
  - What was the debt-to-equity ratio for Tesla Inc in Q4 of 2022?
- source_sentence: According to Johnson & Johnson’s 2024 guidance report, their pharmaceutical
    sector was projected to grow by 7% in 2023 after considering crucial factors like
    the overall market demand, introduction of new drugs and potential impact of patent
    expirations.
  sentences:
  - What are Caterpillar's initiatives for enhancing its product sustainability?
  - How is JPMorgan Chase & Co. improving its cybersecurity measures?
  - What was the projected growth of Johnson & Johnson’s pharmaceutical sector in
    2023?
model-index:
- name: UAE-Large-V1-financial-embeddings-matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 1024
      type: dim_1024
    metrics:
    - type: cosine_accuracy@1
      value: 0.8316062176165803
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9326424870466321
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.966321243523316
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9896373056994818
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8316062176165803
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.31088082901554404
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1932642487046632
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09896373056994817
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8316062176165803
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9326424870466321
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.966321243523316
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9896373056994818
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9113990251008172
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8860854099843737
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.886565872062324
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.8290155440414507
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9326424870466321
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.966321243523316
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9844559585492227
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8290155440414507
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.31088082901554404
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1932642487046632
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09844559585492228
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8290155440414507
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9326424870466321
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.966321243523316
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9844559585492227
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9098442107332023
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8854439098610082
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8863342112694444
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.8238341968911918
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9378238341968912
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9637305699481865
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9844559585492227
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8238341968911918
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3126079447322971
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19274611398963729
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09844559585492228
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8238341968911918
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9378238341968912
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9637305699481865
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9844559585492227
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9085199240883707
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8836016530964717
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8844289493397997
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.8212435233160622
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9326424870466321
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.961139896373057
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9792746113989638
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8212435233160622
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.31088082901554404
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19222797927461138
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09792746113989637
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8212435233160622
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9326424870466321
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.961139896373057
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9792746113989638
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9050964679750835
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8807097623159799
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8817273654804927
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.8186528497409327
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9352331606217616
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.961139896373057
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9792746113989638
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8186528497409327
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3117443868739206
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19222797927461138
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09792746113989637
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8186528497409327
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9352331606217616
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.961139896373057
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9792746113989638
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9031436826413919
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8781797433999506
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8793080516202277
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.7979274611398963
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9222797927461139
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9585492227979274
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9792746113989638
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7979274611398963
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.307426597582038
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19170984455958548
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09792746113989637
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.7979274611398963
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9222797927461139
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9585492227979274
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9792746113989638
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8935743388819871
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8654926391973025
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8667278930244052
      name: Cosine Map@100
---

# UAE-Large-V1-financial-embeddings-matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [WhereIsAI/UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [WhereIsAI/UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1) <!-- at revision 52d9e291d9fc7fc7f5276ff077b26fd1880c7c4f -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
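The Pooling module above uses CLS-token pooling (`pooling_mode_cls_token: True`): the sentence embedding is simply the hidden state of the first (`[CLS]`) token of the transformer output. A minimal numpy sketch of just that pooling step, on toy data rather than real model output:

```python
import numpy as np

# Toy "transformer output": batch of 2 sequences, 5 tokens each, 1024-dim hidden states
token_embeddings = np.random.rand(2, 5, 1024)

# CLS pooling: keep only the first token's hidden state per sequence
sentence_embeddings = token_embeddings[:, 0, :]

print(sentence_embeddings.shape)  # (2, 1024)
```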

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("rbhatia46/UAE-Large-V1-financial-rag-matryoshka")
# Run inference
sentences = [
    'According to Johnson & Johnson’s 2024 guidance report, their pharmaceutical sector was projected to grow by 7% in 2023 after considering crucial factors like the overall market demand, introduction of new drugs and potential impact of patent expirations.',
    'What was the projected growth of Johnson & Johnson’s pharmaceutical sector in 2023?',
    'How is JPMorgan Chase & Co. improving its cybersecurity measures?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
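Because the model was trained with MatryoshkaLoss, the leading dimensions of each embedding remain useful on their own: you can truncate the 1024-dim vectors to, say, 256 dimensions and renormalize before computing cosine similarity. A small numpy sketch of that post-processing on stand-in vectors (with recent sentence-transformers versions, passing `truncate_dim=256` to `SentenceTransformer` should achieve the same effect):

```python
import numpy as np

def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep only the first `dim` Matryoshka dimensions and L2-normalize the result."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

embeddings = np.random.rand(3, 1024)   # stand-in for model.encode(...) output
small = truncate_and_normalize(embeddings, 256)
print(small.shape)                     # (3, 256)

# Cosine similarity on the truncated space is a dot product of unit vectors
similarities = small @ small.T
print(similarities.shape)              # (3, 3)
```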

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_1024`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8316     |
| cosine_accuracy@3   | 0.9326     |
| cosine_accuracy@5   | 0.9663     |
| cosine_accuracy@10  | 0.9896     |
| cosine_precision@1  | 0.8316     |
| cosine_precision@3  | 0.3109     |
| cosine_precision@5  | 0.1933     |
| cosine_precision@10 | 0.099      |
| cosine_recall@1     | 0.8316     |
| cosine_recall@3     | 0.9326     |
| cosine_recall@5     | 0.9663     |
| cosine_recall@10    | 0.9896     |
| cosine_ndcg@10      | 0.9114     |
| cosine_mrr@10       | 0.8861     |
| **cosine_map@100**  | **0.8866** |

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.829      |
| cosine_accuracy@3   | 0.9326     |
| cosine_accuracy@5   | 0.9663     |
| cosine_accuracy@10  | 0.9845     |
| cosine_precision@1  | 0.829      |
| cosine_precision@3  | 0.3109     |
| cosine_precision@5  | 0.1933     |
| cosine_precision@10 | 0.0984     |
| cosine_recall@1     | 0.829      |
| cosine_recall@3     | 0.9326     |
| cosine_recall@5     | 0.9663     |
| cosine_recall@10    | 0.9845     |
| cosine_ndcg@10      | 0.9098     |
| cosine_mrr@10       | 0.8854     |
| **cosine_map@100**  | **0.8863** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8238     |
| cosine_accuracy@3   | 0.9378     |
| cosine_accuracy@5   | 0.9637     |
| cosine_accuracy@10  | 0.9845     |
| cosine_precision@1  | 0.8238     |
| cosine_precision@3  | 0.3126     |
| cosine_precision@5  | 0.1927     |
| cosine_precision@10 | 0.0984     |
| cosine_recall@1     | 0.8238     |
| cosine_recall@3     | 0.9378     |
| cosine_recall@5     | 0.9637     |
| cosine_recall@10    | 0.9845     |
| cosine_ndcg@10      | 0.9085     |
| cosine_mrr@10       | 0.8836     |
| **cosine_map@100**  | **0.8844** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8212     |
| cosine_accuracy@3   | 0.9326     |
| cosine_accuracy@5   | 0.9611     |
| cosine_accuracy@10  | 0.9793     |
| cosine_precision@1  | 0.8212     |
| cosine_precision@3  | 0.3109     |
| cosine_precision@5  | 0.1922     |
| cosine_precision@10 | 0.0979     |
| cosine_recall@1     | 0.8212     |
| cosine_recall@3     | 0.9326     |
| cosine_recall@5     | 0.9611     |
| cosine_recall@10    | 0.9793     |
| cosine_ndcg@10      | 0.9051     |
| cosine_mrr@10       | 0.8807     |
| **cosine_map@100**  | **0.8817** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8187     |
| cosine_accuracy@3   | 0.9352     |
| cosine_accuracy@5   | 0.9611     |
| cosine_accuracy@10  | 0.9793     |
| cosine_precision@1  | 0.8187     |
| cosine_precision@3  | 0.3117     |
| cosine_precision@5  | 0.1922     |
| cosine_precision@10 | 0.0979     |
| cosine_recall@1     | 0.8187     |
| cosine_recall@3     | 0.9352     |
| cosine_recall@5     | 0.9611     |
| cosine_recall@10    | 0.9793     |
| cosine_ndcg@10      | 0.9031     |
| cosine_mrr@10       | 0.8782     |
| **cosine_map@100**  | **0.8793** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7979     |
| cosine_accuracy@3   | 0.9223     |
| cosine_accuracy@5   | 0.9585     |
| cosine_accuracy@10  | 0.9793     |
| cosine_precision@1  | 0.7979     |
| cosine_precision@3  | 0.3074     |
| cosine_precision@5  | 0.1917     |
| cosine_precision@10 | 0.0979     |
| cosine_recall@1     | 0.7979     |
| cosine_recall@3     | 0.9223     |
| cosine_recall@5     | 0.9585     |
| cosine_recall@10    | 0.9793     |
| cosine_ndcg@10      | 0.8936     |
| cosine_mrr@10       | 0.8655     |
| **cosine_map@100**  | **0.8667** |

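For intuition, the @k metrics above can be reproduced from ranked retrieval results with a few lines of plain Python. A hypothetical toy example with three queries and one relevant document each (in this single-positive setup, accuracy@k and recall@k coincide):

```python
# Rank (1-based) at which the single relevant document was retrieved for each query
ranks = [1, 3, 12]

def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant document appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)

print(accuracy_at_k(ranks, 1))   # 0.333...: only the first query hits at rank 1
print(accuracy_at_k(ranks, 10))  # 0.666...: ranks 1 and 3 are within the top 10
print(mrr_at_k(ranks, 10))       # (1 + 1/3) / 3 ≈ 0.444
```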
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 3,474 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 1000 samples:
  |         | positive                                                                            | anchor                                                                            |
  |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                               | string                                                                             |
  | details | <ul><li>min: 15 tokens</li><li>mean: 44.84 tokens</li><li>max: 112 tokens</li></ul>  | <ul><li>min: 8 tokens</li><li>mean: 18.34 tokens</li><li>max: 32 tokens</li></ul>  |
* Samples:
  | positive                                                                                                                                                                                                                                                                     | anchor                                                                                  |
  |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------|
  | <code>Exxon Mobil faces substantial risk factors including fluctuating market prices for oil and gas, regulatory environment changes and the potential for catastrophic accidents such as oil spills.</code>                                                                   | <code>What is the key risk factor faced by Exxon Mobil in the energy sector?</code>     |
  | <code>Tesla’s remarkable revenue growth in 2023 is largely driven by its robust electric vehicle sales in China and the strong demand for its energy storage products.</code>                                                                                                  | <code>What is the main reason behind Tesla’s revenue growth in 2023?</code>             |
  | <code>Amazon is expected to see a sales growth of 23% in the next financial year, driven by the increased demand for their ecommerce business and strong growth in AWS. This projection is subject to changes in the market condition and customer spending patterns.</code>   | <code>What is the projected sales growth for Amazon in the next financial year?</code>  |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          1024,
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
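Conceptually, this configuration computes the base MultipleNegativesRankingLoss (in-batch softmax cross-entropy over scaled cosine similarities) on each truncated embedding prefix and sums the results with equal weights. A schematic numpy sketch of that computation, assuming a similarity scale of 20 (the sentence-transformers default); the actual implementation differs in details such as batching and autograd:

```python
import numpy as np

def mnr_loss(anchors, positives):
    """Multiple-negatives ranking loss: softmax cross-entropy over cosine
    similarities, with each anchor's matching positive as the target class."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = a @ p.T * 20.0                      # scaled cosine similarities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # diagonal = correct pairs

def matryoshka_loss(anchors, positives, dims=(1024, 768, 512, 256, 128, 64)):
    """Equal-weight sum of the base loss over each truncated prefix."""
    return sum(mnr_loss(anchors[:, :d], positives[:, :d]) for d in dims)

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 1024))
positives = rng.normal(size=(8, 1024))
loss = matryoshka_loss(anchors, positives)
print(loss)  # a positive scalar
```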

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
801
+
802
+ ### Training Logs
803
+ | Epoch | Step | Training Loss | dim_1024_cosine_map@100 | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
804
+ |:----------:|:------:|:-------------:|:-----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
805
+ | 0.8807 | 6 | - | 0.8708 | 0.8499 | 0.8647 | 0.8705 | 0.8307 | 0.8700 |
806
+ | 1.4679 | 10 | 0.7358 | - | - | - | - | - | - |
807
+ | 1.9083 | 13 | - | 0.8848 | 0.8724 | 0.8782 | 0.8861 | 0.8617 | 0.8855 |
808
+ | **2.9358** | **20** | **0.1483** | **0.8865** | **0.8793** | **0.8814** | **0.8857** | **0.8667** | **0.8863** |
809
+ | 3.5229 | 24 | - | 0.8866 | 0.8793 | 0.8817 | 0.8844 | 0.8667 | 0.8863 |
810
+
811
+ * The bold row denotes the saved checkpoint.
812
+
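The `dim_*` columns in the training log are produced by evaluating the same embeddings truncated to smaller Matryoshka dimensions. A minimal numpy sketch of that truncate-and-renormalize step (the vectors here are random placeholders, not outputs of this model):

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep only the first `dim` components of each embedding, then
    re-normalize to unit length so cosine similarity stays comparable."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

# Toy stand-ins for the model's 1024-dim sentence embeddings.
rng = np.random.default_rng(0)
full = rng.normal(size=(4, 1024))
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = truncate_embeddings(full, 64)
print(small.shape)                                       # (4, 64)
print(bool(np.allclose(np.linalg.norm(small, axis=1), 1.0)))  # True
```

The same idea applies at any of the evaluated dimensions (64, 128, 256, 512, 768, 1024); smaller dimensions trade a little retrieval quality (see the `dim_64` column) for much cheaper storage and search.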
+ ### Framework Versions
+ - Python: 3.10.6
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 0.32.1
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+ title={Matryoshka Representation Learning},
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+ year={2024},
+ eprint={2205.13147},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG}
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+ year={2017},
+ eprint={1705.00652},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
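MultipleNegativesRankingLoss uses the other positives in a batch as negatives: for anchor i, candidate i is the true positive, and the loss is the mean cross-entropy of each row of the scaled similarity matrix against the diagonal labels. A hedged numpy sketch of that computation (the scale of 20 mirrors a common default for cosine-similarity scores, and the batch shapes are illustrative):

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors: np.ndarray,
                                    positives: np.ndarray,
                                    scale: float = 20.0) -> float:
    """In-batch negatives loss: row i of the score matrix should place
    its probability mass on column i (anchor i's true positive)."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                            # (batch, batch)
    # Numerically stable log-softmax over each row.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Cross-entropy with labels on the diagonal.
    return float(-np.mean(np.diag(log_probs)))

# Correctly matched pairs should incur a much lower loss than mismatched ones.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 32))
loss_matched = multiple_negatives_ranking_loss(anchors, anchors)
loss_shuffled = multiple_negatives_ranking_loss(anchors, np.roll(anchors, 1, axis=0))
print(loss_matched < loss_shuffled)  # True
```

This is why the `no_duplicates` batch sampler listed in the training arguments matters: a duplicate positive inside a batch would be treated as a false negative by this loss.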
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+ "_name_or_path": "WhereIsAI/UAE-Large-V1",
+ "architectures": [
+ "BertModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 1024,
+ "initializer_range": 0.02,
+ "intermediate_size": 4096,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 24,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.41.2",
+ "type_vocab_size": 2,
+ "use_cache": false,
+ "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "__version__": {
+ "sentence_transformers": "3.0.1",
+ "transformers": "4.41.2",
+ "pytorch": "2.1.2+cu121"
+ },
+ "prompts": {},
+ "default_prompt_name": null,
+ "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:21cc373f00100e294aa091d1f585dd25d8ee0a5d84af30b84c5af39bea829392
+ size 1340612432
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ }
+ ]
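`modules.json` chains a Transformer module into a Pooling module, and the pooling config in `1_Pooling/config.json` enables `pooling_mode_cls_token`: the sentence embedding is simply the hidden state of the first ([CLS]) token. A minimal numpy sketch of that pooling step, using dummy encoder outputs rather than the real model:

```python
import numpy as np

def cls_pooling(token_embeddings: np.ndarray) -> np.ndarray:
    """CLS pooling: select the first token's hidden state per sequence.
    token_embeddings: (batch, seq_len, hidden) -> (batch, hidden)"""
    return token_embeddings[:, 0, :]

# Dummy encoder output: batch of 2 sequences, 8 tokens, hidden size 1024.
hidden = np.random.default_rng(1).normal(size=(2, 8, 1024))
sentence_embeddings = cls_pooling(hidden)
print(sentence_embeddings.shape)  # (2, 1024)
```

In practice `SentenceTransformer` wires these modules together automatically when the model is loaded; the sketch only shows what the Pooling stage contributes.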
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 512,
+ "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "100": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "101": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "102": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "103": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "do_basic_tokenize": true,
+ "do_lower_case": true,
+ "mask_token": "[MASK]",
+ "model_max_length": 512,
+ "never_split": null,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff