09/23/2023 12:10:45 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
09/23/2023 12:11:04 - INFO - __main__ - Training/evaluation parameters Namespace(train_file='../../../data/mcqa/atomic/train_atm_n_2i_half_sample_name.jsonl', dev_file='../../../data/mcqa/atomic/dev_random_10k.jsonl', model_type='deberta-mlm', model_name_or_path='microsoft/deberta-v3-large', config_name='', tokenizer_name='', cache_dir='.cache', task_name='atomic', output_dir='output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6', second_train_file=None, second_dev_file=None, max_seq_length=128, max_words_to_mask=6, max_sequence_per_time=80, do_train=True, do_eval=True, do_ext_eval=True, evaluate_during_training=True, do_lower_case=False, per_gpu_train_batch_size=2, per_gpu_eval_batch_size=16, gradient_accumulation_steps=16, margin=1.0, learning_rate=5e-06, weight_decay=0.01, adam_epsilon=1e-06, max_grad_norm=1.0, num_train_epochs=1.0, max_steps=-1, warmup_steps=0, warmup_proportion=0.05, logging_steps=50, save_steps=500, logits_file='logits_test.txt', results_file='eval_results.txt', no_cuda=False, overwrite_output_dir=False, seed=42, fp16=False, fp16_opt_level='O1', local_rank=-1, server_ip='', server_port='', eval_output_dir='./eval_results', n_gpu=1, device=device(type='cuda'))
09/23/2023 12:11:13 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 12:11:13 - INFO - __main__ - Num examples = 10000
09/23/2023 12:11:13 - INFO - __main__ - Batch size = 16
09/23/2023 12:15:11 - INFO - __main__ - ***** Eval results *****
09/23/2023 12:15:11 - INFO - __main__ - acc = 0.3392
09/23/2023 12:25:13 - INFO - __main__ - warm up steps = 835
09/23/2023 12:25:13 - INFO - __main__ - ***** Running training *****
09/23/2023 12:25:13 - INFO - __main__ - Num examples = 534833
09/23/2023 12:25:13 - INFO - __main__ - Num Epochs = 1
09/23/2023 12:25:13 - INFO - __main__ - Instantaneous batch size per GPU = 2
09/23/2023 12:25:13 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 32
09/23/2023 12:25:13 - INFO - __main__ - Gradient Accumulation steps = 16
09/23/2023 12:25:13 - INFO - __main__ - Total optimization steps = 16713
09/23/2023 12:28:54 - INFO - __main__ - global_step = 50, average loss = 0.6903331369534135
09/23/2023 12:32:33 - INFO - __main__ - global_step = 100, average loss = 0.6819266405794769
09/23/2023 12:36:13 - INFO - __main__ - global_step = 150, average loss = 0.6690767159638926
09/23/2023 12:39:56 - INFO - __main__ - global_step = 200, average loss = 0.6476348407182377
09/23/2023 12:43:39 - INFO - __main__ - global_step = 250, average loss = 0.6220815655076877
09/23/2023 12:47:19 - INFO - __main__ - global_step = 300, average loss = 0.5299683179453859
09/23/2023 12:50:56 - INFO - __main__ - global_step = 350, average loss = 0.39345016410181416
09/23/2023 12:54:38 - INFO - __main__ - global_step = 400, average loss = 0.31127411118301096
09/23/2023 12:58:19 - INFO - __main__ - global_step = 450, average loss = 0.25150225180907
09/23/2023 13:02:00 - INFO - __main__ - global_step = 500, average loss = 0.22586858159028453
09/23/2023 13:02:01 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 13:02:01 - INFO - __main__ - Num examples = 10000
09/23/2023 13:02:01 - INFO - __main__ - Batch size = 16
09/23/2023 13:05:56 - INFO - __main__ - ***** Eval results *****
09/23/2023 13:05:56 - INFO - __main__ - acc = 0.6996
09/23/2023 13:06:23 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 13:10:02 - INFO - __main__ - global_step = 550, average loss = 0.22251796642665794
09/23/2023 13:13:46 - INFO - __main__ - global_step = 600, average loss = 0.19366045010890956
09/23/2023 13:17:29 - INFO - __main__ - global_step = 650, average loss = 0.18587105088678071
09/23/2023 13:21:15 - INFO - __main__ - global_step = 700, average loss = 0.1760789550206391
09/23/2023 13:24:59 - INFO - __main__ - global_step = 750, average loss = 0.18312411408871412
09/23/2023 13:28:42 - INFO - __main__ - global_step = 800, average loss = 0.15576540186157217
09/23/2023 13:32:25 - INFO - __main__ - global_step = 850, average loss = 0.16302873345994157
09/23/2023 13:36:07 - INFO - __main__ - global_step = 900, average loss = 0.15725697406036487
09/23/2023 13:39:46 - INFO - __main__ - global_step = 950, average loss = 0.15640976145299645
09/23/2023 13:43:33 - INFO - __main__ - global_step = 1000, average loss = 0.15606625928507128
09/23/2023 13:43:34 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 13:43:34 - INFO - __main__ - Num examples = 10000
09/23/2023 13:43:34 - INFO - __main__ - Batch size = 16
09/23/2023 13:47:30 - INFO - __main__ - ***** Eval results *****
09/23/2023 13:47:30 - INFO - __main__ - acc = 0.7961
09/23/2023 13:47:58 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 13:51:41 - INFO - __main__ - global_step = 1050, average loss = 0.14431810150181262
09/23/2023 13:55:20 - INFO - __main__ - global_step = 1100, average loss = 0.15233074207513708
09/23/2023 13:59:01 - INFO - __main__ - global_step = 1150, average loss = 0.1404175848151772
09/23/2023 14:02:44 - INFO - __main__ - global_step = 1200, average loss = 0.12134294869215864
09/23/2023 14:06:20 - INFO - __main__ - global_step = 1250, average loss = 0.1363200130731275
09/23/2023 14:09:59 - INFO - __main__ - global_step = 1300, average loss = 0.13769450530940958
09/23/2023 14:13:43 - INFO - __main__ - global_step = 1350, average loss = 0.12156560226379952
09/23/2023 14:17:18 - INFO - __main__ - global_step = 1400, average loss = 0.12623315585107775
09/23/2023 14:20:59 - INFO - __main__ - global_step = 1450, average loss = 0.14377202547417256
09/23/2023 14:24:33 - INFO - __main__ - global_step = 1500, average loss = 0.1286695548933858
09/23/2023 14:24:34 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 14:24:34 - INFO - __main__ - Num examples = 10000
09/23/2023 14:24:34 - INFO - __main__ - Batch size = 16
09/23/2023 14:28:29 - INFO - __main__ - ***** Eval results *****
09/23/2023 14:28:29 - INFO - __main__ - acc = 0.8048
09/23/2023 14:28:56 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 14:32:42 - INFO - __main__ - global_step = 1550, average loss = 0.1198868363915244
09/23/2023 14:36:24 - INFO - __main__ - global_step = 1600, average loss = 0.12324378551486007
09/23/2023 14:40:00 - INFO - __main__ - global_step = 1650, average loss = 0.11938468464672042
09/23/2023 14:43:41 - INFO - __main__ - global_step = 1700, average loss = 0.14236379045556533
09/23/2023 14:47:22 - INFO - __main__ - global_step = 1750, average loss = 0.13320694023670512
09/23/2023 14:51:02 - INFO - __main__ - global_step = 1800, average loss = 0.13622453257718006
09/23/2023 14:54:42 - INFO - __main__ - global_step = 1850, average loss = 0.13987649206645072
09/23/2023 14:58:22 - INFO - __main__ - global_step = 1900, average loss = 0.12299754774277971
09/23/2023 15:02:05 - INFO - __main__ - global_step = 1950, average loss = 0.11868109124743569
09/23/2023 15:05:47 - INFO - __main__ - global_step = 2000, average loss = 0.1415042275990345
09/23/2023 15:05:47 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 15:05:47 - INFO - __main__ - Num examples = 10000
09/23/2023 15:05:47 - INFO - __main__ - Batch size = 16
09/23/2023 15:09:43 - INFO - __main__ - ***** Eval results *****
09/23/2023 15:09:43 - INFO - __main__ - acc = 0.8063
09/23/2023 15:10:10 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 15:13:51 - INFO - __main__ - global_step = 2050, average loss = 0.11399275673671581
09/23/2023 15:17:31 - INFO - __main__ - global_step = 2100, average loss = 0.1065546132405143
09/23/2023 15:21:11 - INFO - __main__ - global_step = 2150, average loss = 0.12809142941467144
09/23/2023 15:24:51 - INFO - __main__ - global_step = 2200, average loss = 0.12454848410692648
09/23/2023 15:28:34 - INFO - __main__ - global_step = 2250, average loss = 0.10986286829065647
09/23/2023 15:32:14 - INFO - __main__ - global_step = 2300, average loss = 0.11237965747121052
09/23/2023 15:35:56 - INFO - __main__ - global_step = 2350, average loss = 0.10897610924319451
09/23/2023 15:39:41 - INFO - __main__ - global_step = 2400, average loss = 0.12056981857070241
09/23/2023 15:43:24 - INFO - __main__ - global_step = 2450, average loss = 0.13911059297635803
09/23/2023 15:47:10 - INFO - __main__ - global_step = 2500, average loss = 0.11335444856034883
09/23/2023 15:47:10 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 15:47:10 - INFO - __main__ - Num examples = 10000
09/23/2023 15:47:10 - INFO - __main__ - Batch size = 16
09/23/2023 15:51:06 - INFO - __main__ - ***** Eval results *****
09/23/2023 15:51:06 - INFO - __main__ - acc = 0.8234
09/23/2023 15:51:32 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 15:55:10 - INFO - __main__ - global_step = 2550, average loss = 0.12103958850973867
09/23/2023 15:58:57 - INFO - __main__ - global_step = 2600, average loss = 0.11913071399074397
09/23/2023 16:02:38 - INFO - __main__ - global_step = 2650, average loss = 0.11255583499452769
09/23/2023 16:06:28 - INFO - __main__ - global_step = 2700, average loss = 0.1006322616293619
09/23/2023 16:10:12 - INFO - __main__ - global_step = 2750, average loss = 0.0932968783121487
09/23/2023 16:13:51 - INFO - __main__ - global_step = 2800, average loss = 0.11056979637924087
09/23/2023 16:17:38 - INFO - __main__ - global_step = 2850, average loss = 0.12318793082176853
09/23/2023 16:21:21 - INFO - __main__ - global_step = 2900, average loss = 0.10864610994302439
09/23/2023 16:25:03 - INFO - __main__ - global_step = 2950, average loss = 0.11261582636667299
09/23/2023 16:28:40 - INFO - __main__ - global_step = 3000, average loss = 0.12150005620278534
09/23/2023 16:28:40 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 16:28:40 - INFO - __main__ - Num examples = 10000
09/23/2023 16:28:40 - INFO - __main__ - Batch size = 16
09/23/2023 16:32:35 - INFO - __main__ - ***** Eval results *****
09/23/2023 16:32:35 - INFO - __main__ - acc = 0.8261
09/23/2023 16:33:02 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 16:36:46 - INFO - __main__ - global_step = 3050, average loss = 0.10565035182957218
09/23/2023 16:40:30 - INFO - __main__ - global_step = 3100, average loss = 0.10429829731896462
09/23/2023 16:44:14 - INFO - __main__ - global_step = 3150, average loss = 0.10812272985053824
09/23/2023 16:47:54 - INFO - __main__ - global_step = 3200, average loss = 0.12238092143270478
09/23/2023 16:51:33 - INFO - __main__ - global_step = 3250, average loss = 0.10868940783606376
09/23/2023 16:55:14 - INFO - __main__ - global_step = 3300, average loss = 0.1209917226509424
09/23/2023 16:58:59 - INFO - __main__ - global_step = 3350, average loss = 0.1191260662042896
09/23/2023 17:02:41 - INFO - __main__ - global_step = 3400, average loss = 0.1174743126919202
09/23/2023 17:06:26 - INFO - __main__ - global_step = 3450, average loss = 0.100895225374843
09/23/2023 17:10:02 - INFO - __main__ - global_step = 3500, average loss = 0.0931866138278565
09/23/2023 17:10:03 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 17:10:03 - INFO - __main__ - Num examples = 10000
09/23/2023 17:10:03 - INFO - __main__ - Batch size = 16
09/23/2023 17:13:58 - INFO - __main__ - ***** Eval results *****
09/23/2023 17:13:58 - INFO - __main__ - acc = 0.8229
09/23/2023 17:17:45 - INFO - __main__ - global_step = 3550, average loss = 0.10633477224648231
09/23/2023 17:21:30 - INFO - __main__ - global_step = 3600, average loss = 0.1021722938354651
09/23/2023 17:25:11 - INFO - __main__ - global_step = 3650, average loss = 0.10295378862727375
09/23/2023 17:28:50 - INFO - __main__ - global_step = 3700, average loss = 0.1024187771679135
09/23/2023 17:32:34 - INFO - __main__ - global_step = 3750, average loss = 0.09922411829451448
09/23/2023 17:36:14 - INFO - __main__ - global_step = 3800, average loss = 0.11105157318372222
09/23/2023 17:39:57 - INFO - __main__ - global_step = 3850, average loss = 0.12378941989987652
09/23/2023 17:43:42 - INFO - __main__ - global_step = 3900, average loss = 0.1034327056143593
09/23/2023 17:47:25 - INFO - __main__ - global_step = 3950, average loss = 0.09697925167827634
09/23/2023 17:51:09 - INFO - __main__ - global_step = 4000, average loss = 0.11230336717126192
09/23/2023 17:51:09 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 17:51:09 - INFO - __main__ - Num examples = 10000
09/23/2023 17:51:09 - INFO - __main__ - Batch size = 16
09/23/2023 17:55:05 - INFO - __main__ - ***** Eval results *****
09/23/2023 17:55:05 - INFO - __main__ - acc = 0.8371
09/23/2023 17:55:32 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 17:59:12 - INFO - __main__ - global_step = 4050, average loss = 0.10925351051962934
09/23/2023 18:03:00 - INFO - __main__ - global_step = 4100, average loss = 0.09795216493275802
09/23/2023 18:06:43 - INFO - __main__ - global_step = 4150, average loss = 0.09962472554965643
09/23/2023 18:10:25 - INFO - __main__ - global_step = 4200, average loss = 0.10342389734141762
09/23/2023 18:14:05 - INFO - __main__ - global_step = 4250, average loss = 0.09674815248567029
09/23/2023 18:17:48 - INFO - __main__ - global_step = 4300, average loss = 0.10319628210134396
09/23/2023 18:21:33 - INFO - __main__ - global_step = 4350, average loss = 0.09340641272166977
09/23/2023 18:25:14 - INFO - __main__ - global_step = 4400, average loss = 0.10845618240913608
09/23/2023 18:28:59 - INFO - __main__ - global_step = 4450, average loss = 0.11604906246473547
09/23/2023 18:32:43 - INFO - __main__ - global_step = 4500, average loss = 0.09590314964269055
09/23/2023 18:32:43 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 18:32:43 - INFO - __main__ - Num examples = 10000
09/23/2023 18:32:43 - INFO - __main__ - Batch size = 16
09/23/2023 18:36:38 - INFO - __main__ - ***** Eval results *****
09/23/2023 18:36:38 - INFO - __main__ - acc = 0.8305
09/23/2023 18:40:22 - INFO - __main__ - global_step = 4550, average loss = 0.09955280199857952
09/23/2023 18:44:07 - INFO - __main__ - global_step = 4600, average loss = 0.09018894311768236
09/23/2023 18:47:49 - INFO - __main__ - global_step = 4650, average loss = 0.11624654464081687
09/23/2023 18:51:30 - INFO - __main__ - global_step = 4700, average loss = 0.11213955332923434
09/23/2023 18:55:07 - INFO - __main__ - global_step = 4750, average loss = 0.11335175217776851
09/23/2023 18:58:47 - INFO - __main__ - global_step = 4800, average loss = 0.10374061681199237
09/23/2023 19:02:34 - INFO - __main__ - global_step = 4850, average loss = 0.09650620453016018
09/23/2023 19:06:16 - INFO - __main__ - global_step = 4900, average loss = 0.1034209698169434
09/23/2023 19:09:53 - INFO - __main__ - global_step = 4950, average loss = 0.10046588191311458
09/23/2023 19:13:34 - INFO - __main__ - global_step = 5000, average loss = 0.10752027794980677
09/23/2023 19:13:34 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 19:13:34 - INFO - __main__ - Num examples = 10000
09/23/2023 19:13:34 - INFO - __main__ - Batch size = 16
09/23/2023 19:17:29 - INFO - __main__ - ***** Eval results *****
09/23/2023 19:17:29 - INFO - __main__ - acc = 0.8355
09/23/2023 19:21:19 - INFO - __main__ - global_step = 5050, average loss = 0.10195030277842307
09/23/2023 19:24:58 - INFO - __main__ - global_step = 5100, average loss = 0.10987481483532065
09/23/2023 19:28:41 - INFO - __main__ - global_step = 5150, average loss = 0.10906005093554995
09/23/2023 19:32:23 - INFO - __main__ - global_step = 5200, average loss = 0.09835696181547973
09/23/2023 19:36:06 - INFO - __main__ - global_step = 5250, average loss = 0.10181126694624254
09/23/2023 19:39:52 - INFO - __main__ - global_step = 5300, average loss = 0.08663028705283068
09/23/2023 19:43:30 - INFO - __main__ - global_step = 5350, average loss = 0.10507196654667496
09/23/2023 19:47:18 - INFO - __main__ - global_step = 5400, average loss = 0.108608085659871
09/23/2023 19:51:03 - INFO - __main__ - global_step = 5450, average loss = 0.099619501844536
09/23/2023 19:54:49 - INFO - __main__ - global_step = 5500, average loss = 0.10225338533447939
09/23/2023 19:54:49 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 19:54:49 - INFO - __main__ - Num examples = 10000
09/23/2023 19:54:49 - INFO - __main__ - Batch size = 16
09/23/2023 19:58:45 - INFO - __main__ - ***** Eval results *****
09/23/2023 19:58:45 - INFO - __main__ - acc = 0.8279
09/23/2023 20:02:26 - INFO - __main__ - global_step = 5550, average loss = 0.10436682683890468
09/23/2023 20:06:11 - INFO - __main__ - global_step = 5600, average loss = 0.10477761221260153
09/23/2023 20:09:52 - INFO - __main__ - global_step = 5650, average loss = 0.09326410317778937
09/23/2023 20:13:31 - INFO - __main__ - global_step = 5700, average loss = 0.11269167278223904
09/23/2023 20:17:16 - INFO - __main__ - global_step = 5750, average loss = 0.10188864256499074
09/23/2023 20:21:00 - INFO - __main__ - global_step = 5800, average loss = 0.10433580860199981
09/23/2023 20:24:43 - INFO - __main__ - global_step = 5850, average loss = 0.08972063858884212
09/23/2023 20:28:22 - INFO - __main__ - global_step = 5900, average loss = 0.1065664726671821
09/23/2023 20:32:07 - INFO - __main__ - global_step = 5950, average loss = 0.10174332244623656
09/23/2023 20:35:49 - INFO - __main__ - global_step = 6000, average loss = 0.08872646622621687
09/23/2023 20:35:49 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 20:35:49 - INFO - __main__ - Num examples = 10000
09/23/2023 20:35:49 - INFO - __main__ - Batch size = 16
09/23/2023 20:39:45 - INFO - __main__ - ***** Eval results *****
09/23/2023 20:39:45 - INFO - __main__ - acc = 0.8363
09/23/2023 20:43:29 - INFO - __main__ - global_step = 6050, average loss = 0.10705330887685705
09/23/2023 20:47:16 - INFO - __main__ - global_step = 6100, average loss = 0.09171272950654384
09/23/2023 20:50:59 - INFO - __main__ - global_step = 6150, average loss = 0.0861645900901567
09/23/2023 20:54:46 - INFO - __main__ - global_step = 6200, average loss = 0.08994678908144124
09/23/2023 20:58:32 - INFO - __main__ - global_step = 6250, average loss = 0.08786970607354305
09/23/2023 21:02:13 - INFO - __main__ - global_step = 6300, average loss = 0.09656520821336016
09/23/2023 21:05:56 - INFO - __main__ - global_step = 6350, average loss = 0.09620310332989902
09/23/2023 21:09:42 - INFO - __main__ - global_step = 6400, average loss = 0.09152124080545036
09/23/2023 21:13:22 - INFO - __main__ - global_step = 6450, average loss = 0.09472263304131047
09/23/2023 21:17:06 - INFO - __main__ - global_step = 6500, average loss = 0.10554198697194807
09/23/2023 21:17:06 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 21:17:06 - INFO - __main__ - Num examples = 10000
09/23/2023 21:17:06 - INFO - __main__ - Batch size = 16
09/23/2023 21:21:01 - INFO - __main__ - ***** Eval results *****
09/23/2023 21:21:01 - INFO - __main__ - acc = 0.841
09/23/2023 21:21:28 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 21:25:14 - INFO - __main__ - global_step = 6550, average loss = 0.09830655160796596
09/23/2023 21:28:55 - INFO - __main__ - global_step = 6600, average loss = 0.09539545015402837
09/23/2023 21:32:40 - INFO - __main__ - global_step = 6650, average loss = 0.09118585625503328
09/23/2023 21:36:18 - INFO - __main__ - global_step = 6700, average loss = 0.09700520555491493
09/23/2023 21:40:03 - INFO - __main__ - global_step = 6750, average loss = 0.105271778342576
09/23/2023 21:43:45 - INFO - __main__ - global_step = 6800, average loss = 0.10975144471223758
09/23/2023 21:47:28 - INFO - __main__ - global_step = 6850, average loss = 0.09920243133579788
09/23/2023 21:51:11 - INFO - __main__ - global_step = 6900, average loss = 0.09791661702009151
09/23/2023 21:54:51 - INFO - __main__ - global_step = 6950, average loss = 0.08630025177910283
09/23/2023 21:58:29 - INFO - __main__ - global_step = 7000, average loss = 0.09660528897402401
09/23/2023 21:58:29 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 21:58:29 - INFO - __main__ - Num examples = 10000
09/23/2023 21:58:29 - INFO - __main__ - Batch size = 16
09/23/2023 22:02:25 - INFO - __main__ - ***** Eval results *****
09/23/2023 22:02:25 - INFO - __main__ - acc = 0.843
09/23/2023 22:02:51 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 22:06:33 - INFO - __main__ - global_step = 7050, average loss = 0.10305566756385814
09/23/2023 22:10:07 - INFO - __main__ - global_step = 7100, average loss = 0.10687436608219286
09/23/2023 22:13:47 - INFO - __main__ - global_step = 7150, average loss = 0.0946133067667688
09/23/2023 22:17:27 - INFO - __main__ - global_step = 7200, average loss = 0.09795189084834419
09/23/2023 22:21:17 - INFO - __main__ - global_step = 7250, average loss = 0.09060888570308634
09/23/2023 22:24:59 - INFO - __main__ - global_step = 7300, average loss = 0.0877145413684775
09/23/2023 22:28:35 - INFO - __main__ - global_step = 7350, average loss = 0.10495714643941029
09/23/2023 22:32:21 - INFO - __main__ - global_step = 7400, average loss = 0.07401456630654138
09/23/2023 22:36:03 - INFO - __main__ - global_step = 7450, average loss = 0.09523518772701209
09/23/2023 22:39:41 - INFO - __main__ - global_step = 7500, average loss = 0.10137952610446518
09/23/2023 22:39:41 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 22:39:41 - INFO - __main__ - Num examples = 10000
09/23/2023 22:39:41 - INFO - __main__ - Batch size = 16
09/23/2023 22:43:37 - INFO - __main__ - ***** Eval results *****
09/23/2023 22:43:37 - INFO - __main__ - acc = 0.846
09/23/2023 22:44:03 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 22:47:46 - INFO - __main__ - global_step = 7550, average loss = 0.09563293447645264
09/23/2023 22:51:31 - INFO - __main__ - global_step = 7600, average loss = 0.09618103489105125
09/23/2023 22:55:13 - INFO - __main__ - global_step = 7650, average loss = 0.08849806944810552
09/23/2023 22:58:54 - INFO - __main__ - global_step = 7700, average loss = 0.10007433392238455
09/23/2023 23:02:36 - INFO - __main__ - global_step = 7750, average loss = 0.09035434001329122
09/23/2023 23:06:24 - INFO - __main__ - global_step = 7800, average loss = 0.09338357288788757
09/23/2023 23:10:04 - INFO - __main__ - global_step = 7850, average loss = 0.09912064949181514
09/23/2023 23:13:47 - INFO - __main__ - global_step = 7900, average loss = 0.08827902228244057
09/23/2023 23:17:27 - INFO - __main__ - global_step = 7950, average loss = 0.11218067690118914
09/23/2023 23:21:09 - INFO - __main__ - global_step = 8000, average loss = 0.08588292430682486
09/23/2023 23:21:09 - INFO - __main__ - ***** Running evaluation *****
09/23/2023 23:21:09 - INFO - __main__ - Num examples = 10000
09/23/2023 23:21:09 - INFO - __main__ - Batch size = 16
09/23/2023 23:25:05 - INFO - __main__ - ***** Eval results *****
09/23/2023 23:25:05 - INFO - __main__ - acc = 0.8472
09/23/2023 23:25:31 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/23/2023 23:29:08 - INFO - __main__ - global_step = 8050, average loss = 0.09245043838061974
09/23/2023 23:32:54 - INFO - __main__ - global_step = 8100, average loss = 0.08283289226481429
09/23/2023 23:36:34 - INFO - __main__ - global_step = 8150, average loss = 0.08407623038449856
09/23/2023 23:40:17 - INFO - __main__ - global_step = 8200, average loss = 0.09736820162237564
09/23/2023 23:44:06 - INFO - __main__ - global_step = 8250, average loss = 0.08463705457368632
09/23/2023 23:47:50 - INFO - __main__ - global_step = 8300, average loss = 0.10010304888644896
09/23/2023 23:51:35 - INFO - __main__ - global_step = 8350, average loss = 0.09222401980725409
09/23/2023 23:55:17 - INFO - __main__ - global_step = 8400, average loss = 0.08634746881416504
09/23/2023 23:58:59 - INFO - __main__ - global_step = 8450, average loss = 0.08723288500368653
09/24/2023 00:02:37 - INFO - __main__ - global_step = 8500, average loss = 0.10130320921433394
09/24/2023 00:02:37 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 00:02:37 - INFO - __main__ - Num examples = 10000
09/24/2023 00:02:37 - INFO - __main__ - Batch size = 16
09/24/2023 00:06:32 - INFO - __main__ - ***** Eval results *****
09/24/2023 00:06:32 - INFO - __main__ - acc = 0.8452
09/24/2023 00:10:13 - INFO - __main__ - global_step = 8550, average loss = 0.0889340414837352
09/24/2023 00:13:53 - INFO - __main__ - global_step = 8600, average loss = 0.0960574367789377
09/24/2023 00:17:37 - INFO - __main__ - global_step = 8650, average loss = 0.07860265792332939
09/24/2023 00:21:20 - INFO - __main__ - global_step = 8700, average loss = 0.09233207383847912
09/24/2023 00:25:05 - INFO - __main__ - global_step = 8750, average loss = 0.09803196908305836
09/24/2023 00:28:44 - INFO - __main__ - global_step = 8800, average loss = 0.08913468146740343
09/24/2023 00:32:26 - INFO - __main__ - global_step = 8850, average loss = 0.0880054514182666
09/24/2023 00:36:11 - INFO - __main__ - global_step = 8900, average loss = 0.0839999437017832
09/24/2023 00:39:52 - INFO - __main__ - global_step = 8950, average loss = 0.10094311676693905
09/24/2023 00:43:32 - INFO - __main__ - global_step = 9000, average loss = 0.10011614485312748
09/24/2023 00:43:32 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 00:43:32 - INFO - __main__ - Num examples = 10000
09/24/2023 00:43:32 - INFO - __main__ - Batch size = 16
09/24/2023 00:47:27 - INFO - __main__ - ***** Eval results *****
09/24/2023 00:47:27 - INFO - __main__ - acc = 0.8463
09/24/2023 00:51:10 - INFO - __main__ - global_step = 9050, average loss = 0.09407024829903093
09/24/2023 00:54:48 - INFO - __main__ - global_step = 9100, average loss = 0.09510339217069032
09/24/2023 00:58:27 - INFO - __main__ - global_step = 9150, average loss = 0.09413513723055075
09/24/2023 01:02:10 - INFO - __main__ - global_step = 9200, average loss = 0.08488880819528276
09/24/2023 01:05:47 - INFO - __main__ - global_step = 9250, average loss = 0.09847264970565447
09/24/2023 01:09:28 - INFO - __main__ - global_step = 9300, average loss = 0.08640140883806452
09/24/2023 01:13:08 - INFO - __main__ - global_step = 9350, average loss = 0.07884123000112594
09/24/2023 01:16:54 - INFO - __main__ - global_step = 9400, average loss = 0.0831154512307694
09/24/2023 01:20:32 - INFO - __main__ - global_step = 9450, average loss = 0.09913980022422038
09/24/2023 01:24:11 - INFO - __main__ - global_step = 9500, average loss = 0.09805536182444484
09/24/2023 01:24:11 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 01:24:11 - INFO - __main__ - Num examples = 10000
09/24/2023 01:24:11 - INFO - __main__ - Batch size = 16
09/24/2023 01:28:07 - INFO - __main__ - ***** Eval results *****
09/24/2023 01:28:07 - INFO - __main__ - acc = 0.8463
09/24/2023 01:31:55 - INFO - __main__ - global_step = 9550, average loss = 0.0912455873134968
09/24/2023 01:35:38 - INFO - __main__ - global_step = 9600, average loss = 0.10278063782119716
09/24/2023 01:39:12 - INFO - __main__ - global_step = 9650, average loss = 0.08788584528032516
09/24/2023 01:42:53 - INFO - __main__ - global_step = 9700, average loss = 0.08058010207216285
09/24/2023 01:46:34 - INFO - __main__ - global_step = 9750, average loss = 0.08765123128723644
09/24/2023 01:50:14 - INFO - __main__ - global_step = 9800, average loss = 0.09005017607181799
09/24/2023 01:54:03 - INFO - __main__ - global_step = 9850, average loss = 0.07892634223760979
09/24/2023 01:57:44 - INFO - __main__ - global_step = 9900, average loss = 0.07999062808303278
09/24/2023 02:01:26 - INFO - __main__ - global_step = 9950, average loss = 0.09494447313452838
09/24/2023 02:05:06 - INFO - __main__ - global_step = 10000, average loss = 0.0841888710015337
09/24/2023 02:05:06 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 02:05:06 - INFO - __main__ - Num examples = 10000
09/24/2023 02:05:06 - INFO - __main__ - Batch size = 16
09/24/2023 02:09:01 - INFO - __main__ - ***** Eval results *****
09/24/2023 02:09:01 - INFO - __main__ - acc = 0.8471
09/24/2023 02:12:40 - INFO - __main__ - global_step = 10050, average loss = 0.08929907138342968
09/24/2023 02:16:20 - INFO - __main__ - global_step = 10100, average loss = 0.10172551687661326
09/24/2023 02:20:00 - INFO - __main__ - global_step = 10150, average loss = 0.09577305402533966
09/24/2023 02:23:46 - INFO - __main__ - global_step = 10200, average loss = 0.09480085656211486
09/24/2023 02:27:27 - INFO - __main__ - global_step = 10250, average loss = 0.07956519629078684
09/24/2023 02:31:05 - INFO - __main__ - global_step = 10300, average loss = 0.08291967767250753
09/24/2023 02:34:47 - INFO - __main__ - global_step = 10350, average loss = 0.09592102762369904
09/24/2023 02:38:29 - INFO - __main__ - global_step = 10400, average loss = 0.08570889301292482
09/24/2023 02:42:13 - INFO - __main__ - global_step = 10450, average loss = 0.07362440132081247
09/24/2023 02:45:58 - INFO - __main__ - global_step = 10500, average loss = 0.08574875552483718
09/24/2023 02:45:58 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 02:45:58 - INFO - __main__ - Num examples = 10000
09/24/2023 02:45:58 - INFO - __main__ - Batch size = 16
09/24/2023 02:49:53 - INFO - __main__ - ***** Eval results *****
09/24/2023 02:49:53 - INFO - __main__ - acc = 0.8524
09/24/2023 02:50:20 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 02:54:03 - INFO - __main__ - global_step = 10550, average loss = 0.08846153970320302
09/24/2023 02:57:43 - INFO - __main__ - global_step = 10600, average loss = 0.08381684645668429
09/24/2023 03:01:26 - INFO - __main__ - global_step = 10650, average loss = 0.09288432469184045
09/24/2023 03:05:08 - INFO - __main__ - global_step = 10700, average loss = 0.08199916316298186
09/24/2023 03:08:56 - INFO - __main__ - global_step = 10750, average loss = 0.09068042659768252
09/24/2023 03:12:37 - INFO - __main__ - global_step = 10800, average loss = 0.08719110449641448
09/24/2023 03:16:20 - INFO - __main__ - global_step = 10850, average loss = 0.09036207084544003
09/24/2023 03:20:04 - INFO - __main__ - global_step = 10900, average loss = 0.095746248819637
09/24/2023 03:23:45 - INFO - __main__ - global_step = 10950, average loss = 0.1019882604497252
09/24/2023 03:27:25 - INFO - __main__ - global_step = 11000, average loss = 0.08660416512644588
09/24/2023 03:27:25 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 03:27:25 - INFO - __main__ - Num examples = 10000
09/24/2023 03:27:25 - INFO - __main__ - Batch size = 16
09/24/2023 03:31:21 - INFO - __main__ - ***** Eval results *****
09/24/2023 03:31:21 - INFO - __main__ - acc = 0.8521
09/24/2023 03:35:00 - INFO - __main__ - global_step = 11050, average loss = 0.07959849048202158
09/24/2023 03:38:42 - INFO - __main__ - global_step = 11100, average loss = 0.08480279741248524
09/24/2023 03:42:25 - INFO - __main__ - global_step = 11150, average loss = 0.07940411141982623
09/24/2023 03:46:06 - INFO - __main__ - global_step = 11200, average loss = 0.08627346496621613
09/24/2023 03:49:48 - INFO - __main__ - global_step = 11250, average loss = 0.08515130840663915
09/24/2023 03:53:28 - INFO - __main__ - global_step = 11300, average loss = 0.08047833000106039
09/24/2023 03:57:07 - INFO - __main__ - global_step = 11350, average loss = 0.08884227124826338
09/24/2023 04:00:47 - INFO - __main__ - global_step = 11400, average loss = 0.09542614945773494
09/24/2023 04:04:26 - INFO - __main__ - global_step = 11450, average loss = 0.08332637125422479
09/24/2023 04:08:07 - INFO - __main__ - global_step = 11500, average loss = 0.09769482501476887
09/24/2023 04:08:07 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 04:08:07 - INFO - __main__ - Num examples = 10000
09/24/2023 04:08:07 - INFO - __main__ - Batch size = 16
09/24/2023 04:12:02 - INFO - __main__ - ***** Eval results *****
09/24/2023 04:12:02 - INFO - __main__ - acc = 0.851
09/24/2023 04:15:51 - INFO - __main__ - global_step = 11550, average loss = 0.09137944790694746
09/24/2023 04:19:38 - INFO - __main__ - global_step = 11600, average loss = 0.07454582622590351
09/24/2023 04:23:20 - INFO - __main__ - global_step = 11650, average loss = 0.08284565404814202
09/24/2023 04:26:59 - INFO - __main__ - global_step = 11700, average loss = 0.0969824349215196
09/24/2023 04:30:41 - INFO - __main__ - global_step = 11750, average loss = 0.09389037321489013
09/24/2023 04:34:23 - INFO - __main__ - global_step = 11800, average loss = 0.08608788483528769
09/24/2023 04:38:05 - INFO - __main__ - global_step = 11850, average loss = 0.09322659247220144
09/24/2023 04:41:49 - INFO - __main__ - global_step = 11900, average loss = 0.09286965438863262
09/24/2023 04:45:31 - INFO - __main__ - global_step = 11950, average loss = 0.08214385434631367
09/24/2023 04:49:12 - INFO - __main__ - global_step = 12000, average loss = 0.09392224536069989
09/24/2023 04:49:12 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 04:49:12 - INFO - __main__ - Num examples = 10000
09/24/2023 04:49:12 - INFO - __main__ - Batch size = 16
09/24/2023 04:53:07 - INFO - __main__ - ***** Eval results *****
09/24/2023 04:53:07 - INFO - __main__ - acc = 0.8514
09/24/2023 04:56:53 - INFO - __main__ - global_step = 12050, average loss = 0.08019034011129406
09/24/2023 05:00:34 - INFO - __main__ - global_step = 12100, average loss = 0.08210711618239656
09/24/2023 05:04:16 - INFO - __main__ - global_step = 12150, average loss = 0.08764273267355747
09/24/2023 05:08:02 - INFO - __main__ - global_step = 12200, average loss = 0.08758470895321807
09/24/2023 05:11:48 - INFO - __main__ - global_step = 12250, average loss = 0.07766548367973883
09/24/2023 05:15:27 - INFO - __main__ - global_step = 12300, average loss = 0.08148344823415755
09/24/2023 05:19:08 - INFO - __main__ - global_step = 12350, average loss = 0.08814196670609817
09/24/2023 05:22:50 - INFO - __main__ - global_step = 12400, average loss = 0.08936668847491092
09/24/2023 05:26:29 - INFO - __main__ - global_step = 12450, average loss = 0.08240065188347216
09/24/2023 05:30:12 - INFO - __main__ - global_step = 12500, average loss = 0.08683115135392655
09/24/2023 05:30:12 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 05:30:12 - INFO - __main__ - Num examples = 10000
09/24/2023 05:30:12 - INFO - __main__ - Batch size = 16
09/24/2023 05:34:07 - INFO - __main__ - ***** Eval results *****
09/24/2023 05:34:07 - INFO - __main__ - acc = 0.8515
09/24/2023 05:37:53 - INFO - __main__ - global_step = 12550, average loss = 0.08871277472944712
09/24/2023 05:41:34 - INFO - __main__ - global_step = 12600, average loss = 0.08797626828309149
09/24/2023 05:45:11 - INFO - __main__ - global_step = 12650, average loss = 0.10095825259459616
09/24/2023 05:48:58 - INFO - __main__ - global_step = 12700, average loss = 0.07953012495926487
09/24/2023 05:52:41 - INFO - __main__ - global_step = 12750, average loss = 0.08843418272979761
09/24/2023 05:56:19 - INFO - __main__ - global_step = 12800, average loss = 0.07413991435227217
09/24/2023 05:59:59 - INFO - __main__ - global_step = 12850, average loss = 0.07519575585451094
09/24/2023 06:03:48 - INFO - __main__ - global_step = 12900, average loss = 0.08996981896292709
09/24/2023 06:07:28 - INFO - __main__ - global_step = 12950, average loss = 0.08996171029284597
09/24/2023 06:11:11 - INFO - __main__ - global_step = 13000, average loss = 0.08077499923689174
09/24/2023 06:11:11 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 06:11:11 - INFO - __main__ - Num examples = 10000
09/24/2023 06:11:11 - INFO - __main__ - Batch size = 16
09/24/2023 06:15:06 - INFO - __main__ - ***** Eval results *****
09/24/2023 06:15:06 - INFO - __main__ - acc = 0.8527
09/24/2023 06:15:33 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 06:19:13 - INFO - __main__ - global_step = 13050, average loss = 0.08447560470420284
09/24/2023 06:22:54 - INFO - __main__ - global_step = 13100, average loss = 0.08299598100831646
09/24/2023 06:26:32 - INFO - __main__ - global_step = 13150, average loss = 0.08393764879734135
09/24/2023 06:30:08 - INFO - __main__ - global_step = 13200, average loss = 0.09848508099505125
09/24/2023 06:33:47 - INFO - __main__ - global_step = 13250, average loss = 0.09162080157435412
09/24/2023 06:37:28 - INFO - __main__ - global_step = 13300, average loss = 0.0914362099875143
09/24/2023 06:41:09 - INFO - __main__ - global_step = 13350, average loss = 0.07781068138462616
09/24/2023 06:44:55 - INFO - __main__ - global_step = 13400, average loss = 0.08868030074576382
09/24/2023 06:48:36 - INFO - __main__ - global_step = 13450, average loss = 0.08357623873533157
09/24/2023 06:52:18 - INFO - __main__ - global_step = 13500, average loss = 0.08828085365807055
09/24/2023 06:52:18 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 06:52:18 - INFO - __main__ - Num examples = 10000
09/24/2023 06:52:18 - INFO - __main__ - Batch size = 16
09/24/2023 06:56:14 - INFO - __main__ - ***** Eval results *****
09/24/2023 06:56:14 - INFO - __main__ - acc = 0.8499
09/24/2023 06:59:57 - INFO - __main__ - global_step = 13550, average loss = 0.08140521681067185
09/24/2023 07:03:37 - INFO - __main__ - global_step = 13600, average loss = 0.08341409597109305
09/24/2023 07:07:17 - INFO - __main__ - global_step = 13650, average loss = 0.08142950747031136
09/24/2023 07:10:56 - INFO - __main__ - global_step = 13700, average loss = 0.09089667504686076
09/24/2023 07:14:45 - INFO - __main__ - global_step = 13750, average loss = 0.07177684095106088
09/24/2023 07:18:24 - INFO - __main__ - global_step = 13800, average loss = 0.08592368463818274
09/24/2023 07:22:01 - INFO - __main__ - global_step = 13850, average loss = 0.08120634569131653
09/24/2023 07:25:48 - INFO - __main__ - global_step = 13900, average loss = 0.08909589071197843
09/24/2023 07:29:30 - INFO - __main__ - global_step = 13950, average loss = 0.08629100337015189
09/24/2023 07:33:10 - INFO - __main__ - global_step = 14000, average loss = 0.07722124511306902
09/24/2023 07:33:10 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 07:33:10 - INFO - __main__ - Num examples = 10000
09/24/2023 07:33:10 - INFO - __main__ - Batch size = 16
09/24/2023 07:37:05 - INFO - __main__ - ***** Eval results *****
09/24/2023 07:37:05 - INFO - __main__ - acc = 0.8533
09/24/2023 07:37:32 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 07:41:11 - INFO - __main__ - global_step = 14050, average loss = 0.08182521525057382
09/24/2023 07:44:48 - INFO - __main__ - global_step = 14100, average loss = 0.0902410151962249
09/24/2023 07:48:28 - INFO - __main__ - global_step = 14150, average loss = 0.07409664937826164
09/24/2023 07:52:12 - INFO - __main__ - global_step = 14200, average loss = 0.08879891355274594
09/24/2023 07:55:53 - INFO - __main__ - global_step = 14250, average loss = 0.09268313445325475
09/24/2023 07:59:30 - INFO - __main__ - global_step = 14300, average loss = 0.08798344542199629
09/24/2023 08:03:13 - INFO - __main__ - global_step = 14350, average loss = 0.09607475698139752
09/24/2023 08:06:59 - INFO - __main__ - global_step = 14400, average loss = 0.07222031111843535
09/24/2023 08:10:40 - INFO - __main__ - global_step = 14450, average loss = 0.07480319764195884
09/24/2023 08:14:19 - INFO - __main__ - global_step = 14500, average loss = 0.0838716509303049
09/24/2023 08:14:19 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 08:14:19 - INFO - __main__ - Num examples = 10000
09/24/2023 08:14:19 - INFO - __main__ - Batch size = 16
09/24/2023 08:18:16 - INFO - __main__ - ***** Eval results *****
09/24/2023 08:18:16 - INFO - __main__ - acc = 0.8542
09/24/2023 08:18:42 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 08:22:18 - INFO - __main__ - global_step = 14550, average loss = 0.08034001361316769
09/24/2023 08:25:55 - INFO - __main__ - global_step = 14600, average loss = 0.07689567271547276
09/24/2023 08:29:37 - INFO - __main__ - global_step = 14650, average loss = 0.09093381941405823
09/24/2023 08:33:25 - INFO - __main__ - global_step = 14700, average loss = 0.07569706412876258
09/24/2023 08:37:04 - INFO - __main__ - global_step = 14750, average loss = 0.07479940189456101
09/24/2023 08:40:47 - INFO - __main__ - global_step = 14800, average loss = 0.08522207450543647
09/24/2023 08:44:34 - INFO - __main__ - global_step = 14850, average loss = 0.0889268495763099
09/24/2023 08:48:16 - INFO - __main__ - global_step = 14900, average loss = 0.08616152721479012
09/24/2023 08:51:56 - INFO - __main__ - global_step = 14950, average loss = 0.07867321850848384
09/24/2023 08:55:39 - INFO - __main__ - global_step = 15000, average loss = 0.08426695556714549
09/24/2023 08:55:39 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 08:55:39 - INFO - __main__ - Num examples = 10000
09/24/2023 08:55:39 - INFO - __main__ - Batch size = 16
09/24/2023 08:59:34 - INFO - __main__ - ***** Eval results *****
09/24/2023 08:59:34 - INFO - __main__ - acc = 0.8542
09/24/2023 09:03:12 - INFO - __main__ - global_step = 15050, average loss = 0.07868185437655484
09/24/2023 09:07:00 - INFO - __main__ - global_step = 15100, average loss = 0.08520105790423259
09/24/2023 09:10:42 - INFO - __main__ - global_step = 15150, average loss = 0.09536004922925713
09/24/2023 09:14:19 - INFO - __main__ - global_step = 15200, average loss = 0.08502999547665241
09/24/2023 09:17:58 - INFO - __main__ - global_step = 15250, average loss = 0.08957034896484402
09/24/2023 09:21:34 - INFO - __main__ - global_step = 15300, average loss = 0.07968287494033575
09/24/2023 09:25:14 - INFO - __main__ - global_step = 15350, average loss = 0.08545487473544199
09/24/2023 09:28:55 - INFO - __main__ - global_step = 15400, average loss = 0.08528959889241378
09/24/2023 09:32:38 - INFO - __main__ - global_step = 15450, average loss = 0.08095955706679887
09/24/2023 09:36:19 - INFO - __main__ - global_step = 15500, average loss = 0.08725373520917856
09/24/2023 09:36:19 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 09:36:19 - INFO - __main__ - Num examples = 10000
09/24/2023 09:36:19 - INFO - __main__ - Batch size = 16
09/24/2023 09:40:15 - INFO - __main__ - ***** Eval results *****
09/24/2023 09:40:15 - INFO - __main__ - acc = 0.8545
09/24/2023 09:40:42 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 09:44:22 - INFO - __main__ - global_step = 15550, average loss = 0.0843266883040269
09/24/2023 09:48:03 - INFO - __main__ - global_step = 15600, average loss = 0.07855528741223679
09/24/2023 09:51:47 - INFO - __main__ - global_step = 15650, average loss = 0.09478737017554523
09/24/2023 09:55:32 - INFO - __main__ - global_step = 15700, average loss = 0.08910313490487169
09/24/2023 09:59:16 - INFO - __main__ - global_step = 15750, average loss = 0.07736712342710234
09/24/2023 10:02:53 - INFO - __main__ - global_step = 15800, average loss = 0.08501649839432503
09/24/2023 10:06:37 - INFO - __main__ - global_step = 15850, average loss = 0.08495221398276044
09/24/2023 10:10:23 - INFO - __main__ - global_step = 15900, average loss = 0.08510145512744202
09/24/2023 10:14:07 - INFO - __main__ - global_step = 15950, average loss = 0.08335533107921947
09/24/2023 10:17:49 - INFO - __main__ - global_step = 16000, average loss = 0.09103241352764599
09/24/2023 10:17:49 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 10:17:49 - INFO - __main__ - Num examples = 10000
09/24/2023 10:17:49 - INFO - __main__ - Batch size = 16
09/24/2023 10:21:45 - INFO - __main__ - ***** Eval results *****
09/24/2023 10:21:45 - INFO - __main__ - acc = 0.8549
09/24/2023 10:22:12 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 10:25:53 - INFO - __main__ - global_step = 16050, average loss = 0.0808029190406296
09/24/2023 10:29:33 - INFO - __main__ - global_step = 16100, average loss = 0.0950222506766113
09/24/2023 10:33:15 - INFO - __main__ - global_step = 16150, average loss = 0.08560644885961664
09/24/2023 10:36:53 - INFO - __main__ - global_step = 16200, average loss = 0.07925290400889935
09/24/2023 10:40:34 - INFO - __main__ - global_step = 16250, average loss = 0.08252620983123052
09/24/2023 10:44:15 - INFO - __main__ - global_step = 16300, average loss = 0.08747977073326182
09/24/2023 10:47:55 - INFO - __main__ - global_step = 16350, average loss = 0.08805208059333382
09/24/2023 10:51:41 - INFO - __main__ - global_step = 16400, average loss = 0.07935831163018064
09/24/2023 10:55:23 - INFO - __main__ - global_step = 16450, average loss = 0.0807358610859228
09/24/2023 10:59:03 - INFO - __main__ - global_step = 16500, average loss = 0.0775301494665473
09/24/2023 10:59:03 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 10:59:03 - INFO - __main__ - Num examples = 10000
09/24/2023 10:59:03 - INFO - __main__ - Batch size = 16
09/24/2023 11:02:59 - INFO - __main__ - ***** Eval results *****
09/24/2023 11:02:59 - INFO - __main__ - acc = 0.8532
09/24/2023 11:06:39 - INFO - __main__ - global_step = 16550, average loss = 0.06899339191091712
09/24/2023 11:10:25 - INFO - __main__ - global_step = 16600, average loss = 0.08612027997849508
09/24/2023 11:14:10 - INFO - __main__ - global_step = 16650, average loss = 0.08232147437905951
09/24/2023 11:17:50 - INFO - __main__ - global_step = 16700, average loss = 0.08530993062430753
09/24/2023 11:18:50 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 11:18:50 - INFO - __main__ - Num examples = 10000
09/24/2023 11:18:50 - INFO - __main__ - Batch size = 16
09/24/2023 11:22:45 - INFO - __main__ - ***** Eval results *****
09/24/2023 11:22:45 - INFO - __main__ - acc = 0.8533
09/24/2023 11:22:45 - INFO - __main__ - global_step = 16713, average loss = 0.11041826268834619
09/24/2023 11:23:18 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 11:23:18 - INFO - __main__ - Num examples = 10000
09/24/2023 11:23:18 - INFO - __main__ - Batch size = 16
09/24/2023 11:27:13 - INFO - __main__ - ***** Eval results *****
09/24/2023 11:27:13 - INFO - __main__ - acc = 0.8549
09/24/2023 11:27:16 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/socialiqa_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6', out_dir='./eval_results/deberta-v3-large_2i_atm_half_sample_name_5e-6', device=0, reader='socialiqa', overwrite_output_dir=False, cache_dir=None)
09/24/2023 11:27:16 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 11:34:38 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/winogrande_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6', out_dir='./eval_results/deberta-v3-large_2i_atm_half_sample_name_5e-6', device=0, reader='winogrande', overwrite_output_dir=False, cache_dir=None)
09/24/2023 11:34:38 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 11:37:05 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/piqa_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6', out_dir='./eval_results/deberta-v3-large_2i_atm_half_sample_name_5e-6', device=0, reader='piqa', overwrite_output_dir=False, cache_dir=None)
09/24/2023 11:37:05 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 11:43:59 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/commonsenseqa_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6', out_dir='./eval_results/deberta-v3-large_2i_atm_half_sample_name_5e-6', device=0, reader='commonsenseqa', overwrite_output_dir=False, cache_dir=None)
09/24/2023 11:43:59 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 11:49:43 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/anli_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6', out_dir='./eval_results/deberta-v3-large_2i_atm_half_sample_name_5e-6', device=0, reader='anli', overwrite_output_dir=False, cache_dir=None)
09/24/2023 11:49:43 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/deberta-v3-large_2i_atm_half_sample_name_5e-6
09/24/2023 11:54:31 - INFO - __main__ - ***** Running evaluation *****
09/24/2023 11:54:31 - INFO - __main__ - Num examples = 120
09/24/2023 11:54:31 - INFO - __main__ - Batch size = 16
09/24/2023 11:54:47 - INFO - __main__ - ***** Eval results *****
09/24/2023 11:54:47 - INFO - __main__ - acc = 0.525