Training in progress, step 214, checkpoint

Files changed:
- checkpoint-214/README.md (+279 -99)
- checkpoint-214/optimizer.pt (+1 -1)
- checkpoint-214/pytorch_model.bin (+1 -1)
- checkpoint-214/rng_state.pth (+1 -1)
- checkpoint-214/scheduler.pt (+1 -1)
- checkpoint-214/trainer_state.json (+0 -0)
- checkpoint-214/training_args.bin (+1 -1)
checkpoint-214/README.md
CHANGED
@@ -164,34 +164,34 @@ model-index:
       type: sts-test
     metrics:
     - type: pearson_cosine
-      value: 0.
+      value: 0.854968805652805
       name: Pearson Cosine
     - type: spearman_cosine
-      value: 0.
+      value: 0.8956917253507228
       name: Spearman Cosine
     - type: pearson_manhattan
-      value: 0.
+      value: 0.8864271118397893
       name: Pearson Manhattan
     - type: spearman_manhattan
-      value: 0.
+      value: 0.890112288382125
       name: Spearman Manhattan
     - type: pearson_euclidean
-      value: 0.
+      value: 0.8853384519331917
       name: Pearson Euclidean
     - type: spearman_euclidean
-      value: 0.
+      value: 0.8875533307992096
       name: Spearman Euclidean
     - type: pearson_dot
-      value: 0.
+      value: 0.8534110565503882
       name: Pearson Dot
     - type: spearman_dot
-      value: 0.
+      value: 0.877726389450295
       name: Spearman Dot
     - type: pearson_max
-      value: 0.
+      value: 0.8864271118397893
       name: Pearson Max
     - type: spearman_max
-      value: 0.
+      value: 0.8956917253507228
       name: Spearman Max
   - task:
       type: triplet

@@ -223,79 +223,79 @@ model-index:
       type: VitaminC
     metrics:
     - type: cosine_accuracy
-      value: 0.
+      value: 0.58984375
       name: Cosine Accuracy
     - type: cosine_accuracy_threshold
-      value: 0.
+      value: 0.8360881209373474
       name: Cosine Accuracy Threshold
     - type: cosine_f1
-      value: 0.
+      value: 0.6559999999999999
       name: Cosine F1
     - type: cosine_f1_threshold
-      value: 0.
+      value: 0.26484909653663635
       name: Cosine F1 Threshold
     - type: cosine_precision
-      value: 0.
+      value: 0.4880952380952381
       name: Cosine Precision
     - type: cosine_recall
       value: 1.0
       name: Cosine Recall
     - type: cosine_ap
-      value: 0.
+      value: 0.5601848253252508
       name: Cosine Ap
     - type: dot_accuracy
-      value: 0.
+      value: 0.58203125
       name: Dot Accuracy
     - type: dot_accuracy_threshold
-      value:
+      value: 314.279052734375
       name: Dot Accuracy Threshold
     - type: dot_f1
-      value: 0.
+      value: 0.6558265582655827
       name: Dot F1
     - type: dot_f1_threshold
-      value:
+      value: 126.1304931640625
       name: Dot F1 Threshold
     - type: dot_precision
-      value: 0.
+      value: 0.491869918699187
       name: Dot Precision
     - type: dot_recall
-      value: 0.
+      value: 0.983739837398374
       name: Dot Recall
     - type: dot_ap
-      value: 0.
+      value: 0.5513292673695236
       name: Dot Ap
     - type: manhattan_accuracy
-      value: 0.
+      value: 0.57421875
       name: Manhattan Accuracy
     - type: manhattan_accuracy_threshold
-      value:
+      value: 244.02972412109375
       name: Manhattan Accuracy Threshold
     - type: manhattan_f1
-      value: 0.
+      value: 0.6577540106951871
       name: Manhattan F1
     - type: manhattan_f1_threshold
-      value:
+      value: 498.5762634277344
       name: Manhattan F1 Threshold
     - type: manhattan_precision
-      value: 0.
+      value: 0.4900398406374502
       name: Manhattan Precision
     - type: manhattan_recall
-      value: 0
+      value: 1.0
       name: Manhattan Recall
     - type: manhattan_ap
-      value: 0.
+      value: 0.5562338006363409
       name: Manhattan Ap
     - type: euclidean_accuracy
-      value: 0.
+      value: 0.578125
       name: Euclidean Accuracy
     - type: euclidean_accuracy_threshold
-      value:
+      value: 15.01893424987793
       name: Euclidean Accuracy Threshold
     - type: euclidean_f1
       value: 0.6577540106951871
       name: Euclidean F1
     - type: euclidean_f1_threshold
-      value:
+      value: 23.76571273803711
       name: Euclidean F1 Threshold
     - type: euclidean_precision
       value: 0.4900398406374502

@@ -304,28 +304,28 @@ model-index:
       value: 1.0
       name: Euclidean Recall
     - type: euclidean_ap
-      value: 0.
+      value: 0.5549132214851141
       name: Euclidean Ap
     - type: max_accuracy
-      value: 0.
+      value: 0.58984375
       name: Max Accuracy
     - type: max_accuracy_threshold
-      value:
+      value: 314.279052734375
       name: Max Accuracy Threshold
     - type: max_f1
       value: 0.6577540106951871
       name: Max F1
     - type: max_f1_threshold
-      value:
+      value: 498.5762634277344
       name: Max F1 Threshold
     - type: max_precision
-      value: 0.
+      value: 0.491869918699187
       name: Max Precision
     - type: max_recall
       value: 1.0
       name: Max Recall
     - type: max_ap
-      value: 0.
+      value: 0.5601848253252508
       name: Max Ap
 ---

@@ -388,7 +388,7 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 
 # Download from the 🤗 Hub
-model = SentenceTransformer("bobox/DeBERTa-small-ST-v1-toytest
+model = SentenceTransformer("bobox/DeBERTa-small-ST-v1-toytest")
 # Run inference
 sentences = [
     'who did ben assault in home and away',
|
|
437 |
* Dataset: `sts-test`
|
438 |
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
439 |
|
440 |
-
| Metric | Value
|
441 |
-
|
442 |
-
| pearson_cosine | 0.
|
443 |
-
| **spearman_cosine** | **0.
|
444 |
-
| pearson_manhattan | 0.
|
445 |
-
| spearman_manhattan | 0.
|
446 |
-
| pearson_euclidean | 0.
|
447 |
-
| spearman_euclidean | 0.
|
448 |
-
| pearson_dot | 0.
|
449 |
-
| spearman_dot | 0.
|
450 |
-
| pearson_max | 0.
|
451 |
-
| spearman_max | 0.
|
452 |
|
453 |
#### Triplet
|
454 |
* Dataset: `NLI-v2`
|
@@ -466,43 +466,43 @@ You can finetune this model on your own dataset.
|
|
466 |
* Dataset: `VitaminC`
|
467 |
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
|
468 |
|
469 |
-
| Metric | Value
|
470 |
-
|
471 |
-
| cosine_accuracy | 0.
|
472 |
-
| cosine_accuracy_threshold | 0.
|
473 |
-
| cosine_f1 | 0.
|
474 |
-
| cosine_f1_threshold | 0.
|
475 |
-
| cosine_precision | 0.
|
476 |
-
| cosine_recall | 1.0
|
477 |
-
| cosine_ap | 0.
|
478 |
-
| dot_accuracy | 0.
|
479 |
-
| dot_accuracy_threshold |
|
480 |
-
| dot_f1 | 0.
|
481 |
-
| dot_f1_threshold |
|
482 |
-
| dot_precision | 0.
|
483 |
-
| dot_recall | 0.
|
484 |
-
| dot_ap | 0.
|
485 |
-
| manhattan_accuracy | 0.
|
486 |
-
| manhattan_accuracy_threshold |
|
487 |
-
| manhattan_f1 | 0.
|
488 |
-
| manhattan_f1_threshold |
|
489 |
-
| manhattan_precision | 0.
|
490 |
-
| manhattan_recall | 0
|
491 |
-
| manhattan_ap | 0.
|
492 |
-
| euclidean_accuracy | 0.
|
493 |
-
| euclidean_accuracy_threshold |
|
494 |
-
| euclidean_f1 | 0.6578
|
495 |
-
| euclidean_f1_threshold |
|
496 |
-
| euclidean_precision | 0.49
|
497 |
-
| euclidean_recall | 1.0
|
498 |
-
| euclidean_ap | 0.
|
499 |
-
| max_accuracy | 0.
|
500 |
-
| max_accuracy_threshold |
|
501 |
-
| max_f1 | 0.6578
|
502 |
-
| max_f1_threshold |
|
503 |
-
| max_precision | 0.
|
504 |
-
| max_recall | 1.0
|
505 |
-
| **max_ap** | **0.
|
506 |
|
507 |
<!--
|
508 |
## Bias, Risks and Limitations
|
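The two evaluators patched above are the standard sentence-transformers ones. A toy sketch of how they are typically constructed; the pairs below are illustrative only (the real runs use the sts-test and VitaminC splits):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import (
    BinaryClassificationEvaluator,
    EmbeddingSimilarityEvaluator,
)

model = SentenceTransformer("bobox/DeBERTa-small-ST-v1-toytest")

# Regression-style evaluation: gold similarity scores in [0, 1];
# reports the Pearson/Spearman correlations seen in the table above.
sts_eval = EmbeddingSimilarityEvaluator(
    sentences1=["A man is eating food.", "A plane is taking off."],
    sentences2=["A man is eating a meal.", "A cat sleeps on the couch."],
    scores=[0.95, 0.05],
    name="sts-test",
)

# Binary pair classification: label 1 = matching pair, 0 = non-matching;
# reports the accuracy/F1/AP and decision thresholds tabulated above.
vitaminc_eval = BinaryClassificationEvaluator(
    sentences1=["The film grossed over $100 million."] * 2,
    sentences2=["The film earned more than $90 million.",
                "The film was a box-office failure."],
    labels=[1, 0],
    name="VitaminC",
)

print(sts_eval(model))
print(vitaminc_eval(model))
```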
@@ -1151,14 +1151,14 @@ You can finetune this model on your own dataset.
 #### Non-Default Hyperparameters
 
 - `eval_strategy`: steps
-- `per_device_train_batch_size`:
+- `per_device_train_batch_size`: 320
 - `per_device_eval_batch_size`: 64
-- `gradient_accumulation_steps`:
+- `gradient_accumulation_steps`: 4
 - `learning_rate`: 4e-05
-- `weight_decay`:
+- `weight_decay`: 5e-05
 - `lr_scheduler_type`: cosine_with_min_lr
-- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr':
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1e-05}
-- `warmup_ratio`: 0.
+- `warmup_ratio`: 0.15
 - `save_safetensors`: False
 - `fp16`: True
 - `push_to_hub`: True

@@ -1173,14 +1173,14 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
-- `per_device_train_batch_size`:
+- `per_device_train_batch_size`: 320
 - `per_device_eval_batch_size`: 64
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
-- `gradient_accumulation_steps`:
+- `gradient_accumulation_steps`: 4
 - `eval_accumulation_steps`: None
 - `learning_rate`: 4e-05
-- `weight_decay`:
+- `weight_decay`: 5e-05
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08

@@ -1188,8 +1188,8 @@ You can finetune this model on your own dataset.
 - `num_train_epochs`: 3
 - `max_steps`: -1
 - `lr_scheduler_type`: cosine_with_min_lr
-- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr':
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1e-05}
-- `warmup_ratio`: 0.
+- `warmup_ratio`: 0.15
 - `warmup_steps`: 0
 - `log_level`: passive
 - `log_level_replica`: warning
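Assembled into the sentence-transformers v3 trainer API, the non-default values above would look roughly like this; `output_dir` is an assumption, the remaining fields mirror the list (names follow Hugging Face `TrainingArguments`):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="DeBERTa-small-ST-v1-toytest",  # assumption: not stated in the card
    eval_strategy="steps",
    per_device_train_batch_size=320,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,
    learning_rate=4e-05,
    weight_decay=5e-05,
    num_train_epochs=3,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"num_cycles": 0.5, "min_lr": 1e-05},
    warmup_ratio=0.15,
    save_safetensors=False,
    fp16=True,
    push_to_hub=True,
)
```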
@@ -1282,6 +1282,8 @@ You can finetune this model on your own dataset.
 </details>
 
 ### Training Logs
+<details><summary>Click to expand</summary>
+
 | Epoch | Step | Training Loss | vitaminc-pairs loss | trivia pairs loss | xsum-pairs loss | paws-pos loss | sciq pairs loss | msmarco pairs loss | openbookqa pairs loss | gooaq pairs loss | nq pairs loss | scitail-pairs-pos loss | qasc pairs loss | negation-triplets loss | NLI-v2_max_accuracy | VitaminC_max_ap | sts-test_spearman_cosine |
 |:------:|:----:|:-------------:|:-------------------:|:-----------------:|:---------------:|:-------------:|:---------------:|:------------------:|:---------------------:|:----------------:|:-------------:|:----------------------:|:---------------:|:----------------------:|:-------------------:|:---------------:|:------------------------:|
 | 0.0169 | 3 | 7.2372 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |

@@ -1355,7 +1357,185 @@ You can finetune this model on your own dataset.
 | 1.1687 | 207 | 0.8365 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
 | 1.1856 | 210 | 1.1012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
 | 1.2025 | 213 | 1.0016 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.2195 | 216 | 1.0957 | 2.5466 | 1.1412 | 0.3591 | 0.0395 | 0.0517 | 0.5819 | 0.9366 | 0.9686 | 0.8172 | 0.1901 | 0.3075 | 1.9161 | 1.0 | 0.5385 | 0.8656 |
+| 1.2364 | 219 | 1.1273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.2534 | 222 | 1.2568 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.2703 | 225 | 0.873 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.2872 | 228 | 1.0003 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.3042 | 231 | 1.142 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.3211 | 234 | 0.807 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.3380 | 237 | 1.0231 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.3550 | 240 | 0.797 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.3719 | 243 | 0.8473 | 2.5140 | 1.1067 | 0.2802 | 0.0343 | 0.0467 | 0.5559 | 0.8562 | 0.8929 | 0.7435 | 0.1750 | 0.2355 | 1.8629 | 1.0 | 0.5508 | 0.8687 |
+| 1.3888 | 246 | 0.9531 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.4058 | 249 | 0.9023 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.4227 | 252 | 0.8922 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.4397 | 255 | 0.9874 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.4566 | 258 | 0.8508 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.4735 | 261 | 0.7149 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.4905 | 264 | 0.894 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.5074 | 267 | 0.867 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.5243 | 270 | 0.7493 | 2.5574 | 1.0634 | 0.2217 | 0.0319 | 0.0435 | 0.5027 | 0.7999 | 0.8005 | 0.6530 | 0.1693 | 0.2443 | 1.8535 | 1.0 | 0.5499 | 0.8716 |
+| 1.5413 | 273 | 0.7974 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.5582 | 276 | 0.797 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.5752 | 279 | 0.6749 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.5921 | 282 | 0.9325 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.6090 | 285 | 0.8418 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.6260 | 288 | 1.0135 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.6429 | 291 | 0.6961 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.6598 | 294 | 0.9361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.6768 | 297 | 0.6747 | 2.4871 | 0.9762 | 0.2242 | 0.0291 | 0.0396 | 0.5025 | 0.7668 | 0.7546 | 0.6427 | 0.1596 | 0.1963 | 1.7349 | 1.0 | 0.5461 | 0.8787 |
+| 1.6937 | 300 | 0.7786 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.7107 | 303 | 0.7171 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.7276 | 306 | 0.6627 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.7445 | 309 | 0.6711 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.7615 | 312 | 0.9076 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.7784 | 315 | 0.7414 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.7953 | 318 | 0.582 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.8123 | 321 | 0.6068 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.8292 | 324 | 0.6219 | 2.5197 | 1.0206 | 0.1630 | 0.0273 | 0.0383 | 0.4859 | 0.7109 | 0.7736 | 0.5533 | 0.1535 | 0.2044 | 1.7016 | 1.0 | 0.5532 | 0.8807 |
+| 1.8462 | 327 | 0.5862 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.8631 | 330 | 0.678 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.8800 | 333 | 0.6272 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.8970 | 336 | 0.5048 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.9139 | 339 | 0.7653 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.9308 | 342 | 0.6613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.9478 | 345 | 0.6122 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.9647 | 348 | 0.5939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.9817 | 351 | 0.6923 | 2.4379 | 0.9582 | 0.1464 | 0.0264 | 0.0382 | 0.4348 | 0.7554 | 0.7220 | 0.5432 | 0.1481 | 0.1640 | 1.7345 | 1.0 | 0.5560 | 0.8837 |
+| 1.9986 | 354 | 0.5712 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.0155 | 357 | 0.5969 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.0325 | 360 | 0.5881 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.0494 | 363 | 0.6005 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.0663 | 366 | 0.6066 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.0833 | 369 | 0.4921 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.1002 | 372 | 0.5354 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.1171 | 375 | 0.5602 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.1341 | 378 | 0.5686 | 2.3908 | 0.9614 | 0.1454 | 0.0271 | 0.0374 | 0.4246 | 0.7796 | 0.6965 | 0.5298 | 0.1401 | 0.1604 | 1.7678 | 1.0 | 0.5539 | 0.8804 |
+| 2.1510 | 381 | 0.6496 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.1680 | 384 | 0.4713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.1849 | 387 | 0.6345 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.2018 | 390 | 0.5994 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.2188 | 393 | 0.6763 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.2357 | 396 | 0.7254 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.2526 | 399 | 0.8032 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.2696 | 402 | 0.4914 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.2865 | 405 | 0.6307 | 2.4388 | 0.9862 | 0.1308 | 0.0262 | 0.0379 | 0.3928 | 0.7434 | 0.6976 | 0.4998 | 0.1192 | 0.1466 | 1.7093 | 1.0 | 0.5533 | 0.8859 |
+| 2.3035 | 408 | 0.7493 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.3204 | 411 | 0.5139 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.3373 | 414 | 0.6364 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.3543 | 417 | 0.4763 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.3712 | 420 | 0.583 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.3881 | 423 | 0.5912 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.4051 | 426 | 0.5936 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.4220 | 429 | 0.5959 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.4390 | 432 | 0.676 | 2.4265 | 0.9634 | 0.1220 | 0.0260 | 0.0362 | 0.4292 | 0.7433 | 0.6771 | 0.4752 | 0.1282 | 0.1304 | 1.6943 | 1.0 | 0.5532 | 0.8878 |
+| 2.4559 | 435 | 0.5622 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.4728 | 438 | 0.4633 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.4898 | 441 | 0.5955 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.5067 | 444 | 0.6271 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.5236 | 447 | 0.4988 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.5406 | 450 | 0.519 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.5575 | 453 | 0.5538 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.5745 | 456 | 0.4826 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.5914 | 459 | 0.6322 | 2.4541 | 0.9231 | 0.1224 | 0.0253 | 0.0345 | 0.4048 | 0.7595 | 0.6607 | 0.4713 | 0.1168 | 0.1323 | 1.7024 | 1.0 | 0.5557 | 0.8868 |
+| 2.6083 | 462 | 0.6342 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.6253 | 465 | 0.7012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.6422 | 468 | 0.4175 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.6591 | 471 | 0.7575 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.6761 | 474 | 0.4687 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.6930 | 477 | 0.5907 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.7100 | 480 | 0.4796 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.7269 | 483 | 0.4809 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.7438 | 486 | 0.4696 | 2.4899 | 0.9546 | 0.1169 | 0.0247 | 0.0343 | 0.4138 | 0.7444 | 0.6688 | 0.4838 | 0.1166 | 0.1279 | 1.6605 | 1.0 | 0.5527 | 0.8883 |
+| 2.7608 | 489 | 0.6588 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.7777 | 492 | 0.5675 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.7946 | 495 | 0.4007 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.8116 | 498 | 0.4476 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.8285 | 501 | 0.433 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.8454 | 504 | 0.4154 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.8624 | 507 | 0.5416 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.8793 | 510 | 0.4546 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.8963 | 513 | 0.3326 | 2.4924 | 0.9493 | 0.1071 | 0.0248 | 0.0344 | 0.4033 | 0.7376 | 0.6558 | 0.4478 | 0.1148 | 0.1219 | 1.6918 | 1.0 | 0.5534 | 0.8907 |
+| 2.9132 | 516 | 0.594 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.9301 | 519 | 0.4727 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.9471 | 522 | 0.4701 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.9640 | 525 | 0.4606 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.9809 | 528 | 0.5025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 2.9979 | 531 | 0.4314 | 2.4532 | 0.9270 | 0.1131 | 0.0247 | 0.0344 | 0.3951 | 0.7123 | 0.6345 | 0.4383 | 0.1143 | 0.1159 | 1.7003 | 1.0 | 0.5539 | 0.8904 |
+| 0.0169 | 3 | 0.6012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.0337 | 6 | 0.7573 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.0506 | 9 | 0.9212 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.0674 | 12 | 0.6117 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.0843 | 15 | 0.8545 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.1011 | 18 | 0.6515 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.1180 | 21 | 0.7159 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.1348 | 24 | 0.7019 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.1517 | 27 | 0.4411 | 2.4659 | 0.9318 | 0.1117 | 0.0249 | 0.0345 | 0.3955 | 0.7092 | 0.6506 | 0.4205 | 0.1150 | 0.1110 | 1.7311 | 1.0 | 0.5512 | 0.8906 |
+| 0.1685 | 30 | 0.5125 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.1854 | 33 | 0.6885 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.2022 | 36 | 0.6435 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.2191 | 39 | 0.753 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.2360 | 42 | 0.7427 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.2528 | 45 | 0.5083 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.2697 | 48 | 0.7454 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.2865 | 51 | 0.8356 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.3034 | 54 | 0.8864 | 2.4545 | 0.9158 | 0.1009 | 0.0252 | 0.0347 | 0.3809 | 0.7240 | 0.6208 | 0.4417 | 0.1117 | 0.1055 | 1.7278 | 1.0 | 0.5499 | 0.8877 |
+| 0.3202 | 57 | 0.6015 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.3371 | 60 | 0.9482 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.3539 | 63 | 0.5404 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.3708 | 66 | 0.805 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.3876 | 69 | 0.7184 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.4045 | 72 | 0.8708 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.4213 | 75 | 0.8327 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.4382 | 78 | 0.5025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.4551 | 81 | 0.6517 | 2.3539 | 0.9324 | 0.0842 | 0.0244 | 0.0348 | 0.3454 | 0.7161 | 0.6094 | 0.4443 | 0.1182 | 0.1060 | 1.6492 | 1.0 | 0.5557 | 0.8904 |
+| 0.4719 | 84 | 0.5801 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.4888 | 87 | 0.791 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.5056 | 90 | 0.6042 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.5225 | 93 | 0.7559 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.5393 | 96 | 0.6258 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.5562 | 99 | 0.8853 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.5730 | 102 | 0.5947 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.5899 | 105 | 0.644 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.6067 | 108 | 0.5682 | 2.4271 | 0.9260 | 0.1041 | 0.0246 | 0.0336 | 0.3448 | 0.7514 | 0.6302 | 0.4307 | 0.1059 | 0.1083 | 1.6174 | 1.0 | 0.5569 | 0.8959 |
+| 0.6236 | 111 | 0.5974 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.6404 | 114 | 0.649 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.6573 | 117 | 0.6966 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.6742 | 120 | 0.542 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.6910 | 123 | 0.8583 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.7079 | 126 | 0.6416 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.7247 | 129 | 0.6273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.7416 | 132 | 0.8621 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.7584 | 135 | 0.7221 | 2.3367 | 0.9275 | 0.0930 | 0.0246 | 0.0316 | 0.3425 | 0.7485 | 0.5840 | 0.4126 | 0.1094 | 0.1021 | 1.5713 | 1.0 | 0.5611 | 0.8965 |
+| 0.7753 | 138 | 0.9421 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.7921 | 141 | 0.6845 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.8090 | 144 | 0.5464 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.8258 | 147 | 0.6338 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.8427 | 150 | 0.4993 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.8596 | 153 | 0.6939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.8764 | 156 | 0.5791 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.8933 | 159 | 0.9226 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.9101 | 162 | 0.6336 | 2.3761 | 0.9004 | 0.0762 | 0.0245 | 0.0321 | 0.3709 | 0.6995 | 0.5496 | 0.3908 | 0.1001 | 0.1031 | 1.6305 | 1.0 | 0.5603 | 0.8965 |
+| 0.9270 | 165 | 0.5395 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.9438 | 168 | 0.6874 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.9607 | 171 | 0.5614 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.9775 | 174 | 0.5812 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 0.9944 | 177 | 0.427 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.0112 | 180 | 0.4603 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.0281 | 183 | 0.6493 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.0449 | 186 | 0.6646 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.0618 | 189 | 0.7239 | 2.3752 | 0.8819 | 0.0660 | 0.0248 | 0.0331 | 0.3359 | 0.6889 | 0.5454 | 0.3691 | 0.1044 | 0.1008 | 1.5803 | 1.0 | 0.5602 | 0.8957 |
+| 1.0787 | 192 | 0.7593 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.0955 | 195 | 0.6877 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.1124 | 198 | 0.5482 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.1292 | 201 | 0.6047 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.1461 | 204 | 0.4358 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.1629 | 207 | 0.3343 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.1798 | 210 | 0.5624 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+| 1.1966 | 213 | 0.4578 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
 
+</details>
 
 ### Framework Versions
 - Python: 3.10.13
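Together with `pytorch_model.bin`, the files in this commit (optimizer.pt, scheduler.pt, rng_state.pth, trainer_state.json, training_args.bin) are the standard Hugging Face Trainer resume bundle. A hedged sketch of resuming from this checkpoint; the base model, dataset, and loss wiring are hypothetical (the card does not include them), while `resume_from_checkpoint` is the documented mechanism that restores optimizer, scheduler, RNG, and trainer state so training continues exactly at step 214:

```python
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer

model = SentenceTransformer("microsoft/deberta-v3-small")  # assumption: the base model
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,                    # the SentenceTransformerTrainingArguments shown above
    train_dataset=train_dataset,  # hypothetical: this run's mixture of pair/triplet datasets
    loss=loss,                    # hypothetical: the per-dataset loss mapping
)
trainer.train(resume_from_checkpoint="checkpoint-214")
```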
checkpoint-214/optimizer.pt
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:f888620033c3bd5e6338cc2de6a22da54b58eae0dbd8f5b734cb15d1c2905daf
 size 1130520122

checkpoint-214/pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:729d8a3d31610e823553fa34bfea22348c6a7524dcb7e3cccf567499d4707072
 size 565251810

checkpoint-214/rng_state.pth
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:1fe7816ba1779994fa8b784691e1f822a9d90f77ff053db585d38f5713da6d72
 size 14244

checkpoint-214/scheduler.pt
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0a01939658d28c786324bde1d2fa56a76d998cefa50cd5d29dbb3e2118fc9722
 size 1064

checkpoint-214/trainer_state.json
CHANGED
The diff for this file is too large to render; see the raw diff.

checkpoint-214/training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:72d6ebbf0ffc45e3199e7e67afe865d0f054853a38220ea09a039bd30fc6a761
 size 5688
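Each binary above is stored as a Git LFS pointer, which is why its diff is just a three-line text file whose `oid` changes per commit. A short sketch of how such a pointer is derived from a file: the oid is the SHA-256 of the file's contents and size is its length in bytes.

```python
import hashlib
from pathlib import Path

def lfs_pointer(path: str) -> str:
    """Build the Git LFS pointer text for a local file."""
    data = Path(path).read_bytes()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{hashlib.sha256(data).hexdigest()}\n"
        f"size {len(data)}\n"
    )

# Prints a pointer matching the pytorch_model.bin entry above
print(lfs_pointer("checkpoint-214/pytorch_model.bin"))
```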