09/29/2023 23:17:38 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
09/29/2023 23:17:49 - INFO - __main__ - Training/evaluation parameters Namespace(train_file='../../../data/mcqa/atomic/train_atmc_2i_100k_name.jsonl', dev_file='../../../data/mcqa/atomic/dev_atmc_SyntheticQA_10k.jsonl', model_type='deberta-mlm', model_name_or_path='microsoft/deberta-v3-large', config_name='', tokenizer_name='', cache_dir='.cache', task_name='atomic', output_dir='output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', second_train_file=None, second_dev_file=None, max_seq_length=128, max_words_to_mask=6, max_sequence_per_time=80, do_train=True, do_eval=True, do_ext_eval=True, evaluate_during_training=True, do_lower_case=False, per_gpu_train_batch_size=2, per_gpu_eval_batch_size=32, gradient_accumulation_steps=16, margin=1.0, learning_rate=5e-06, weight_decay=0.01, adam_epsilon=1e-06, max_grad_norm=1.0, num_train_epochs=1.0, max_steps=-1, warmup_steps=0, warmup_proportion=0.05, logging_steps=50, save_steps=200, logits_file='logits_test.txt', results_file='eval_results.txt', no_cuda=False, overwrite_output_dir=False, seed=101, fp16=False, fp16_opt_level='O1', local_rank=-1, server_ip='', server_port='', eval_output_dir='./eval_results', n_gpu=1, device=device(type='cuda'))
09/29/2023 23:17:58 - INFO - __main__ - ***** Running evaluation *****
09/29/2023 23:17:58 - INFO - __main__ - Num examples = 10000
09/29/2023 23:17:58 - INFO - __main__ - Batch size = 32
09/29/2023 23:22:13 - INFO - __main__ - ***** Eval results *****
09/29/2023 23:22:13 - INFO - __main__ - acc = 0.3356
09/29/2023 23:32:56 - INFO - __main__ - warm up steps = 916
09/29/2023 23:32:56 - INFO - __main__ - ***** Running training *****
09/29/2023 23:32:56 - INFO - __main__ - Num examples = 586778
09/29/2023 23:32:56 - INFO - __main__ - Num Epochs = 1
09/29/2023 23:32:56 - INFO - __main__ - Instantaneous batch size per GPU = 2
09/29/2023 23:32:56 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 32
09/29/2023 23:32:56 - INFO - __main__ - Gradient Accumulation steps = 16
09/29/2023 23:32:56 - INFO - __main__ - Total optimization steps = 18336
09/29/2023 23:36:55 - INFO - __main__ - global_step = 50, average loss = 0.6978485188353807
09/29/2023 23:41:05 - INFO - __main__ - global_step = 100, average loss = 0.6761001783981919
09/29/2023 23:45:18 - INFO - __main__ - global_step = 150, average loss = 0.6527128890505992
09/29/2023 23:49:15 - INFO - __main__ - global_step = 200, average loss = 0.6255776268531917
09/29/2023 23:49:16 - INFO - __main__ - ***** Running evaluation *****
09/29/2023 23:49:16 - INFO - __main__ - Num examples = 10000
09/29/2023 23:49:16 - INFO - __main__ - Batch size = 32
09/29/2023 23:53:34 - INFO - __main__ - ***** Eval results *****
09/29/2023 23:53:34 - INFO - __main__ - acc = 0.3839
09/29/2023 23:54:05 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/29/2023 23:58:03 - INFO - __main__ - global_step = 250, average loss = 0.5687153974524699
09/30/2023 00:02:07 - INFO - __main__ - global_step = 300, average loss = 0.4650766727951122
09/30/2023 00:06:15 - INFO - __main__ - global_step = 350, average loss = 0.344281620121983
09/30/2023 00:10:25 - INFO - __main__ - global_step = 400, average loss = 0.2641717765412432
09/30/2023 00:10:26 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 00:10:26 - INFO - __main__ - Num examples = 10000
09/30/2023 00:10:26 - INFO - __main__ - Batch size = 32
09/30/2023 00:14:45 - INFO - __main__ - ***** Eval results *****
09/30/2023 00:14:45 - INFO - __main__ - acc = 0.6657
09/30/2023 00:15:14 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 00:19:09 - INFO - __main__ - global_step = 450, average loss = 0.203622583138349
09/30/2023 00:23:15 - INFO - __main__ - global_step = 500, average loss = 0.19167841194193896
09/30/2023 00:27:33 - INFO - __main__ - global_step = 550, average loss = 0.1768511165331256
09/30/2023 00:31:46 - INFO - __main__ - global_step = 600, average loss = 0.17364913663874176
09/30/2023 00:31:47 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 00:31:47 - INFO - __main__ - Num examples = 10000
09/30/2023 00:31:47 - INFO - __main__ - Batch size = 32
09/30/2023 00:36:06 - INFO - __main__ - ***** Eval results *****
09/30/2023 00:36:06 - INFO - __main__ - acc = 0.7383
09/30/2023 00:36:35 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 00:40:35 - INFO - __main__ - global_step = 650, average loss = 0.16046627445422929
09/30/2023 00:44:50 - INFO - __main__ - global_step = 700, average loss = 0.15604460480608395
09/30/2023 00:49:12 - INFO - __main__ - global_step = 750, average loss = 0.16073274322843645
09/30/2023 00:53:44 - INFO - __main__ - global_step = 800, average loss = 0.15695772335122457
09/30/2023 00:53:44 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 00:53:44 - INFO - __main__ - Num examples = 10000
09/30/2023 00:53:44 - INFO - __main__ - Batch size = 32
09/30/2023 00:58:03 - INFO - __main__ - ***** Eval results *****
09/30/2023 00:58:03 - INFO - __main__ - acc = 0.7684
09/30/2023 00:58:33 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 01:02:32 - INFO - __main__ - global_step = 850, average loss = 0.14848782167286118
09/30/2023 01:06:57 - INFO - __main__ - global_step = 900, average loss = 0.12806821554375347
09/30/2023 01:11:28 - INFO - __main__ - global_step = 950, average loss = 0.1180885765995481
09/30/2023 01:15:52 - INFO - __main__ - global_step = 1000, average loss = 0.13545685631077503
09/30/2023 01:15:53 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 01:15:53 - INFO - __main__ - Num examples = 10000
09/30/2023 01:15:53 - INFO - __main__ - Batch size = 32
09/30/2023 01:20:11 - INFO - __main__ - ***** Eval results *****
09/30/2023 01:20:11 - INFO - __main__ - acc = 0.7644
09/30/2023 01:24:17 - INFO - __main__ - global_step = 1050, average loss = 0.11866092401789502
09/30/2023 01:28:20 - INFO - __main__ - global_step = 1100, average loss = 0.12610675325471676
09/30/2023 01:32:47 - INFO - __main__ - global_step = 1150, average loss = 0.10549746582400985
09/30/2023 01:37:16 - INFO - __main__ - global_step = 1200, average loss = 0.12280375221620489
09/30/2023 01:37:17 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 01:37:17 - INFO - __main__ - Num examples = 10000
09/30/2023 01:37:17 - INFO - __main__ - Batch size = 32
09/30/2023 01:41:35 - INFO - __main__ - ***** Eval results *****
09/30/2023 01:41:35 - INFO - __main__ - acc = 0.7802
09/30/2023 01:42:04 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 01:46:00 - INFO - __main__ - global_step = 1250, average loss = 0.11540970739923068
09/30/2023 01:50:18 - INFO - __main__ - global_step = 1300, average loss = 0.1098322441923665
09/30/2023 01:54:50 - INFO - __main__ - global_step = 1350, average loss = 0.12102181358681265
09/30/2023 01:59:20 - INFO - __main__ - global_step = 1400, average loss = 0.11920341529325014
09/30/2023 01:59:20 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 01:59:20 - INFO - __main__ - Num examples = 10000
09/30/2023 01:59:20 - INFO - __main__ - Batch size = 32
09/30/2023 02:03:40 - INFO - __main__ - ***** Eval results *****
09/30/2023 02:03:40 - INFO - __main__ - acc = 0.7991
09/30/2023 02:04:09 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 02:08:14 - INFO - __main__ - global_step = 1450, average loss = 0.12416476066496215
09/30/2023 02:12:18 - INFO - __main__ - global_step = 1500, average loss = 0.11171700998882443
09/30/2023 02:16:39 - INFO - __main__ - global_step = 1550, average loss = 0.11893717237122474
09/30/2023 02:21:18 - INFO - __main__ - global_step = 1600, average loss = 0.11236542866332457
09/30/2023 02:21:18 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 02:21:18 - INFO - __main__ - Num examples = 10000
09/30/2023 02:21:18 - INFO - __main__ - Batch size = 32
09/30/2023 02:25:38 - INFO - __main__ - ***** Eval results *****
09/30/2023 02:25:38 - INFO - __main__ - acc = 0.7998
09/30/2023 02:26:08 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 02:30:17 - INFO - __main__ - global_step = 1650, average loss = 0.11477049457775138
09/30/2023 02:34:26 - INFO - __main__ - global_step = 1700, average loss = 0.10185962059051235
09/30/2023 02:38:45 - INFO - __main__ - global_step = 1750, average loss = 0.08941184240770554
09/30/2023 02:43:11 - INFO - __main__ - global_step = 1800, average loss = 0.12326178842118679
09/30/2023 02:43:11 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 02:43:11 - INFO - __main__ - Num examples = 10000
09/30/2023 02:43:11 - INFO - __main__ - Batch size = 32
09/30/2023 02:47:30 - INFO - __main__ - ***** Eval results *****
09/30/2023 02:47:30 - INFO - __main__ - acc = 0.7949
09/30/2023 02:51:33 - INFO - __main__ - global_step = 1850, average loss = 0.1172889139153267
09/30/2023 02:55:34 - INFO - __main__ - global_step = 1900, average loss = 0.11077741613984472
09/30/2023 02:59:53 - INFO - __main__ - global_step = 1950, average loss = 0.11476122897045571
09/30/2023 03:04:26 - INFO - __main__ - global_step = 2000, average loss = 0.11272342270149238
09/30/2023 03:04:27 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 03:04:27 - INFO - __main__ - Num examples = 10000
09/30/2023 03:04:27 - INFO - __main__ - Batch size = 32
09/30/2023 03:08:46 - INFO - __main__ - ***** Eval results *****
09/30/2023 03:08:46 - INFO - __main__ - acc = 0.796
09/30/2023 03:12:55 - INFO - __main__ - global_step = 2050, average loss = 0.10859557473420864
09/30/2023 03:17:10 - INFO - __main__ - global_step = 2100, average loss = 0.09719053598862956
09/30/2023 03:21:26 - INFO - __main__ - global_step = 2150, average loss = 0.11492000469923369
09/30/2023 03:25:59 - INFO - __main__ - global_step = 2200, average loss = 0.09694181648810626
09/30/2023 03:25:59 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 03:25:59 - INFO - __main__ - Num examples = 10000
09/30/2023 03:25:59 - INFO - __main__ - Batch size = 32
09/30/2023 03:30:18 - INFO - __main__ - ***** Eval results *****
09/30/2023 03:30:18 - INFO - __main__ - acc = 0.7974
09/30/2023 03:34:20 - INFO - __main__ - global_step = 2250, average loss = 0.10450371610718548
09/30/2023 03:38:29 - INFO - __main__ - global_step = 2300, average loss = 0.09968944377507796
09/30/2023 03:42:35 - INFO - __main__ - global_step = 2350, average loss = 0.09726969640512834
09/30/2023 03:46:47 - INFO - __main__ - global_step = 2400, average loss = 0.10790286644703884
09/30/2023 03:46:48 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 03:46:48 - INFO - __main__ - Num examples = 10000
09/30/2023 03:46:48 - INFO - __main__ - Batch size = 32
09/30/2023 03:51:06 - INFO - __main__ - ***** Eval results *****
09/30/2023 03:51:06 - INFO - __main__ - acc = 0.8019
09/30/2023 03:51:36 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 03:55:37 - INFO - __main__ - global_step = 2450, average loss = 0.0904800341839109
09/30/2023 03:59:49 - INFO - __main__ - global_step = 2500, average loss = 0.09749648973207513
09/30/2023 04:04:09 - INFO - __main__ - global_step = 2550, average loss = 0.09015977876108082
09/30/2023 04:08:36 - INFO - __main__ - global_step = 2600, average loss = 0.11385933604056846
09/30/2023 04:08:37 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 04:08:37 - INFO - __main__ - Num examples = 10000
09/30/2023 04:08:37 - INFO - __main__ - Batch size = 32
09/30/2023 04:12:54 - INFO - __main__ - ***** Eval results *****
09/30/2023 04:12:54 - INFO - __main__ - acc = 0.8079
09/30/2023 04:13:24 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 04:17:30 - INFO - __main__ - global_step = 2650, average loss = 0.09506087936344557
09/30/2023 04:21:44 - INFO - __main__ - global_step = 2700, average loss = 0.09819057766188052
09/30/2023 04:25:56 - INFO - __main__ - global_step = 2750, average loss = 0.09318019706217456
09/30/2023 04:30:01 - INFO - __main__ - global_step = 2800, average loss = 0.08744580631115241
09/30/2023 04:30:02 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 04:30:02 - INFO - __main__ - Num examples = 10000
09/30/2023 04:30:02 - INFO - __main__ - Batch size = 32
09/30/2023 04:34:20 - INFO - __main__ - ***** Eval results *****
09/30/2023 04:34:20 - INFO - __main__ - acc = 0.8088
09/30/2023 04:34:50 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 04:39:07 - INFO - __main__ - global_step = 2850, average loss = 0.10302798340337177
09/30/2023 04:43:20 - INFO - __main__ - global_step = 2900, average loss = 0.09180921425198903
09/30/2023 04:47:38 - INFO - __main__ - global_step = 2950, average loss = 0.09286653973598731
09/30/2023 04:52:11 - INFO - __main__ - global_step = 3000, average loss = 0.09590554324422555
09/30/2023 04:52:12 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 04:52:12 - INFO - __main__ - Num examples = 10000
09/30/2023 04:52:12 - INFO - __main__ - Batch size = 32
09/30/2023 04:56:30 - INFO - __main__ - ***** Eval results *****
09/30/2023 04:56:30 - INFO - __main__ - acc = 0.8082
09/30/2023 05:00:20 - INFO - __main__ - global_step = 3050, average loss = 0.0994117746003758
09/30/2023 05:04:34 - INFO - __main__ - global_step = 3100, average loss = 0.08591548198470264
09/30/2023 05:09:00 - INFO - __main__ - global_step = 3150, average loss = 0.09913339292746969
09/30/2023 05:13:29 - INFO - __main__ - global_step = 3200, average loss = 0.09553639550766092
09/30/2023 05:13:29 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 05:13:29 - INFO - __main__ - Num examples = 10000
09/30/2023 05:13:29 - INFO - __main__ - Batch size = 32
09/30/2023 05:17:46 - INFO - __main__ - ***** Eval results *****
09/30/2023 05:17:46 - INFO - __main__ - acc = 0.8013
09/30/2023 05:21:55 - INFO - __main__ - global_step = 3250, average loss = 0.0932181820196638
09/30/2023 05:25:59 - INFO - __main__ - global_step = 3300, average loss = 0.08498929560689703
09/30/2023 05:30:21 - INFO - __main__ - global_step = 3350, average loss = 0.10022641647228739
09/30/2023 05:34:47 - INFO - __main__ - global_step = 3400, average loss = 0.08711659569285984
09/30/2023 05:34:47 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 05:34:47 - INFO - __main__ - Num examples = 10000
09/30/2023 05:34:47 - INFO - __main__ - Batch size = 32
09/30/2023 05:39:06 - INFO - __main__ - ***** Eval results *****
09/30/2023 05:39:06 - INFO - __main__ - acc = 0.8085
09/30/2023 05:43:04 - INFO - __main__ - global_step = 3450, average loss = 0.08860307957234909
09/30/2023 05:47:15 - INFO - __main__ - global_step = 3500, average loss = 0.09122671313540195
09/30/2023 05:51:40 - INFO - __main__ - global_step = 3550, average loss = 0.09726192618174537
09/30/2023 05:56:06 - INFO - __main__ - global_step = 3600, average loss = 0.09295479882246582
09/30/2023 05:56:07 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 05:56:07 - INFO - __main__ - Num examples = 10000
09/30/2023 05:56:07 - INFO - __main__ - Batch size = 32
09/30/2023 06:00:25 - INFO - __main__ - ***** Eval results *****
09/30/2023 06:00:25 - INFO - __main__ - acc = 0.7981
09/30/2023 06:04:25 - INFO - __main__ - global_step = 3650, average loss = 0.0850781474460382
09/30/2023 06:08:29 - INFO - __main__ - global_step = 3700, average loss = 0.08510007355012932
09/30/2023 06:12:45 - INFO - __main__ - global_step = 3750, average loss = 0.09091129492127947
09/30/2023 06:17:00 - INFO - __main__ - global_step = 3800, average loss = 0.08938177831689245
09/30/2023 06:17:01 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 06:17:01 - INFO - __main__ - Num examples = 10000
09/30/2023 06:17:01 - INFO - __main__ - Batch size = 32
09/30/2023 06:21:19 - INFO - __main__ - ***** Eval results *****
09/30/2023 06:21:19 - INFO - __main__ - acc = 0.8008
09/30/2023 06:25:31 - INFO - __main__ - global_step = 3850, average loss = 0.09504610720792699
09/30/2023 06:29:46 - INFO - __main__ - global_step = 3900, average loss = 0.0801623915314849
09/30/2023 06:34:06 - INFO - __main__ - global_step = 3950, average loss = 0.08579662030970212
09/30/2023 06:38:28 - INFO - __main__ - global_step = 4000, average loss = 0.09399219373066443
09/30/2023 06:38:29 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 06:38:29 - INFO - __main__ - Num examples = 10000
09/30/2023 06:38:29 - INFO - __main__ - Batch size = 32
09/30/2023 06:42:47 - INFO - __main__ - ***** Eval results *****
09/30/2023 06:42:47 - INFO - __main__ - acc = 0.8075
09/30/2023 06:46:50 - INFO - __main__ - global_step = 4050, average loss = 0.07777188256899535
09/30/2023 06:51:06 - INFO - __main__ - global_step = 4100, average loss = 0.09610467369071557
09/30/2023 06:55:28 - INFO - __main__ - global_step = 4150, average loss = 0.08811031442368403
09/30/2023 07:00:00 - INFO - __main__ - global_step = 4200, average loss = 0.08664546085885377
09/30/2023 07:00:01 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 07:00:01 - INFO - __main__ - Num examples = 10000
09/30/2023 07:00:01 - INFO - __main__ - Batch size = 32
09/30/2023 07:04:19 - INFO - __main__ - ***** Eval results *****
09/30/2023 07:04:19 - INFO - __main__ - acc = 0.8193
09/30/2023 07:04:50 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 07:09:00 - INFO - __main__ - global_step = 4250, average loss = 0.0982984783052234
09/30/2023 07:13:25 - INFO - __main__ - global_step = 4300, average loss = 0.08057821323724056
09/30/2023 07:17:51 - INFO - __main__ - global_step = 4350, average loss = 0.08660443297441817
09/30/2023 07:22:18 - INFO - __main__ - global_step = 4400, average loss = 0.09301655420538736
09/30/2023 07:22:19 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 07:22:19 - INFO - __main__ - Num examples = 10000
09/30/2023 07:22:19 - INFO - __main__ - Batch size = 32
09/30/2023 07:26:36 - INFO - __main__ - ***** Eval results *****
09/30/2023 07:26:36 - INFO - __main__ - acc = 0.8113
09/30/2023 07:30:33 - INFO - __main__ - global_step = 4450, average loss = 0.08599573986270116
09/30/2023 07:34:39 - INFO - __main__ - global_step = 4500, average loss = 0.08530666312639369
09/30/2023 07:38:48 - INFO - __main__ - global_step = 4550, average loss = 0.0846066818782856
09/30/2023 07:43:20 - INFO - __main__ - global_step = 4600, average loss = 0.0817996960383789
09/30/2023 07:43:21 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 07:43:21 - INFO - __main__ - Num examples = 10000
09/30/2023 07:43:21 - INFO - __main__ - Batch size = 32
09/30/2023 07:47:39 - INFO - __main__ - ***** Eval results *****
09/30/2023 07:47:39 - INFO - __main__ - acc = 0.82
09/30/2023 07:48:09 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 07:52:15 - INFO - __main__ - global_step = 4650, average loss = 0.09457363621040712
09/30/2023 07:56:34 - INFO - __main__ - global_step = 4700, average loss = 0.09125612366977293
09/30/2023 08:01:01 - INFO - __main__ - global_step = 4750, average loss = 0.08600258652179037
09/30/2023 08:05:26 - INFO - __main__ - global_step = 4800, average loss = 0.09128527461645718
09/30/2023 08:05:26 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 08:05:26 - INFO - __main__ - Num examples = 10000
09/30/2023 08:05:26 - INFO - __main__ - Batch size = 32
09/30/2023 08:09:45 - INFO - __main__ - ***** Eval results *****
09/30/2023 08:09:45 - INFO - __main__ - acc = 0.8151
09/30/2023 08:13:38 - INFO - __main__ - global_step = 4850, average loss = 0.09068508470605594
09/30/2023 08:17:36 - INFO - __main__ - global_step = 4900, average loss = 0.08361487613161443
09/30/2023 08:21:45 - INFO - __main__ - global_step = 4950, average loss = 0.09231334731652169
09/30/2023 08:26:13 - INFO - __main__ - global_step = 5000, average loss = 0.09210781741610845
09/30/2023 08:26:13 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 08:26:13 - INFO - __main__ - Num examples = 10000
09/30/2023 08:26:13 - INFO - __main__ - Batch size = 32
09/30/2023 08:30:31 - INFO - __main__ - ***** Eval results *****
09/30/2023 08:30:31 - INFO - __main__ - acc = 0.8182
09/30/2023 08:34:31 - INFO - __main__ - global_step = 5050, average loss = 0.0987089884125453
09/30/2023 08:38:41 - INFO - __main__ - global_step = 5100, average loss = 0.08649987229902763
09/30/2023 08:43:07 - INFO - __main__ - global_step = 5150, average loss = 0.08150071838943404
09/30/2023 08:47:36 - INFO - __main__ - global_step = 5200, average loss = 0.09248840492458839
09/30/2023 08:47:36 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 08:47:36 - INFO - __main__ - Num examples = 10000
09/30/2023 08:47:36 - INFO - __main__ - Batch size = 32
09/30/2023 08:51:54 - INFO - __main__ - ***** Eval results *****
09/30/2023 08:51:54 - INFO - __main__ - acc = 0.8098
09/30/2023 08:56:07 - INFO - __main__ - global_step = 5250, average loss = 0.08664297451652601
09/30/2023 09:00:14 - INFO - __main__ - global_step = 5300, average loss = 0.0810040804851451
09/30/2023 09:04:19 - INFO - __main__ - global_step = 5350, average loss = 0.08586231906258035
09/30/2023 09:08:41 - INFO - __main__ - global_step = 5400, average loss = 0.06912091931983014
09/30/2023 09:08:41 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 09:08:41 - INFO - __main__ - Num examples = 10000
09/30/2023 09:08:41 - INFO - __main__ - Batch size = 32
09/30/2023 09:12:59 - INFO - __main__ - ***** Eval results *****
09/30/2023 09:12:59 - INFO - __main__ - acc = 0.8138
09/30/2023 09:17:04 - INFO - __main__ - global_step = 5450, average loss = 0.08094093154666553
09/30/2023 09:21:20 - INFO - __main__ - global_step = 5500, average loss = 0.08313021952490089
09/30/2023 09:25:34 - INFO - __main__ - global_step = 5550, average loss = 0.08020198410889862
09/30/2023 09:30:01 - INFO - __main__ - global_step = 5600, average loss = 0.08213623003844987
09/30/2023 09:30:01 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 09:30:01 - INFO - __main__ - Num examples = 10000
09/30/2023 09:30:01 - INFO - __main__ - Batch size = 32
09/30/2023 09:34:19 - INFO - __main__ - ***** Eval results *****
09/30/2023 09:34:19 - INFO - __main__ - acc = 0.8138
09/30/2023 09:38:25 - INFO - __main__ - global_step = 5650, average loss = 0.0817357241499849
09/30/2023 09:42:30 - INFO - __main__ - global_step = 5700, average loss = 0.07617272696845248
09/30/2023 09:46:47 - INFO - __main__ - global_step = 5750, average loss = 0.08003306837461423
09/30/2023 09:51:07 - INFO - __main__ - global_step = 5800, average loss = 0.08461861441275687
09/30/2023 09:51:07 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 09:51:07 - INFO - __main__ - Num examples = 10000
09/30/2023 09:51:07 - INFO - __main__ - Batch size = 32
09/30/2023 09:55:24 - INFO - __main__ - ***** Eval results *****
09/30/2023 09:55:24 - INFO - __main__ - acc = 0.819
09/30/2023 09:59:31 - INFO - __main__ - global_step = 5850, average loss = 0.0827079386992773
09/30/2023 10:03:45 - INFO - __main__ - global_step = 5900, average loss = 0.09033509934786707
09/30/2023 10:08:04 - INFO - __main__ - global_step = 5950, average loss = 0.08679367909935536
09/30/2023 10:12:29 - INFO - __main__ - global_step = 6000, average loss = 0.0677787430045646
09/30/2023 10:12:30 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 10:12:30 - INFO - __main__ - Num examples = 10000
09/30/2023 10:12:30 - INFO - __main__ - Batch size = 32
09/30/2023 10:16:48 - INFO - __main__ - ***** Eval results *****
09/30/2023 10:16:48 - INFO - __main__ - acc = 0.793
09/30/2023 10:20:46 - INFO - __main__ - global_step = 6050, average loss = 0.07449474892706348
09/30/2023 10:24:57 - INFO - __main__ - global_step = 6100, average loss = 0.08253852118214126
09/30/2023 10:29:21 - INFO - __main__ - global_step = 6150, average loss = 0.07779288738580363
09/30/2023 10:33:50 - INFO - __main__ - global_step = 6200, average loss = 0.08415637877900735
09/30/2023 10:33:51 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 10:33:51 - INFO - __main__ - Num examples = 10000
09/30/2023 10:33:51 - INFO - __main__ - Batch size = 32
09/30/2023 10:38:09 - INFO - __main__ - ***** Eval results *****
09/30/2023 10:38:09 - INFO - __main__ - acc = 0.8152
09/30/2023 10:42:10 - INFO - __main__ - global_step = 6250, average loss = 0.0836084969737567
09/30/2023 10:46:22 - INFO - __main__ - global_step = 6300, average loss = 0.09385589220066322
09/30/2023 10:50:35 - INFO - __main__ - global_step = 6350, average loss = 0.09158665712571747
09/30/2023 10:55:02 - INFO - __main__ - global_step = 6400, average loss = 0.0775194574438865
09/30/2023 10:55:03 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 10:55:03 - INFO - __main__ - Num examples = 10000
09/30/2023 10:55:03 - INFO - __main__ - Batch size = 32
09/30/2023 10:59:20 - INFO - __main__ - ***** Eval results *****
09/30/2023 10:59:20 - INFO - __main__ - acc = 0.8155
09/30/2023 11:03:28 - INFO - __main__ - global_step = 6450, average loss = 0.08119687895305105
09/30/2023 11:07:51 - INFO - __main__ - global_step = 6500, average loss = 0.07420433169674652
09/30/2023 11:12:28 - INFO - __main__ - global_step = 6550, average loss = 0.06907126017362315
09/30/2023 11:16:58 - INFO - __main__ - global_step = 6600, average loss = 0.07694708627823274
09/30/2023 11:16:58 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 11:16:58 - INFO - __main__ - Num examples = 10000
09/30/2023 11:16:58 - INFO - __main__ - Batch size = 32
09/30/2023 11:21:17 - INFO - __main__ - ***** Eval results *****
09/30/2023 11:21:17 - INFO - __main__ - acc = 0.8118
09/30/2023 11:25:39 - INFO - __main__ - global_step = 6650, average loss = 0.07814562884639599
09/30/2023 11:30:08 - INFO - __main__ - global_step = 6700, average loss = 0.08736841517616994
09/30/2023 11:34:35 - INFO - __main__ - global_step = 6750, average loss = 0.08082478447904577
09/30/2023 11:39:03 - INFO - __main__ - global_step = 6800, average loss = 0.07488631383661414
09/30/2023 11:39:04 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 11:39:04 - INFO - __main__ - Num examples = 10000
09/30/2023 11:39:04 - INFO - __main__ - Batch size = 32
09/30/2023 11:43:23 - INFO - __main__ - ***** Eval results *****
09/30/2023 11:43:23 - INFO - __main__ - acc = 0.8213
09/30/2023 11:43:49 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 11:47:44 - INFO - __main__ - global_step = 6850, average loss = 0.08088931010104716
09/30/2023 11:51:57 - INFO - __main__ - global_step = 6900, average loss = 0.07495710194933053
09/30/2023 11:56:20 - INFO - __main__ - global_step = 6950, average loss = 0.08142732598964358
09/30/2023 12:00:40 - INFO - __main__ - global_step = 7000, average loss = 0.08055740728428645
09/30/2023 12:00:41 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 12:00:41 - INFO - __main__ - Num examples = 10000
09/30/2023 12:00:41 - INFO - __main__ - Batch size = 32
09/30/2023 12:04:58 - INFO - __main__ - ***** Eval results *****
09/30/2023 12:04:58 - INFO - __main__ - acc = 0.8081
09/30/2023 12:08:49 - INFO - __main__ - global_step = 7050, average loss = 0.08094024127516604
09/30/2023 12:13:05 - INFO - __main__ - global_step = 7100, average loss = 0.08965814252063865
09/30/2023 12:17:22 - INFO - __main__ - global_step = 7150, average loss = 0.07722920090716798
09/30/2023 12:21:45 - INFO - __main__ - global_step = 7200, average loss = 0.08899519631431758
09/30/2023 12:21:46 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 12:21:46 - INFO - __main__ - Num examples = 10000
09/30/2023 12:21:46 - INFO - __main__ - Batch size = 32
09/30/2023 12:26:05 - INFO - __main__ - ***** Eval results *****
09/30/2023 12:26:05 - INFO - __main__ - acc = 0.8124
09/30/2023 12:30:21 - INFO - __main__ - global_step = 7250, average loss = 0.06652378371007217
09/30/2023 12:34:39 - INFO - __main__ - global_step = 7300, average loss = 0.07190304783754982
09/30/2023 12:39:04 - INFO - __main__ - global_step = 7350, average loss = 0.07759228288079612
09/30/2023 12:43:26 - INFO - __main__ - global_step = 7400, average loss = 0.07959542326259907
09/30/2023 12:43:27 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 12:43:27 - INFO - __main__ - Num examples = 10000
09/30/2023 12:43:27 - INFO - __main__ - Batch size = 32
09/30/2023 12:47:45 - INFO - __main__ - ***** Eval results *****
09/30/2023 12:47:45 - INFO - __main__ - acc = 0.8246
09/30/2023 12:48:12 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 12:52:13 - INFO - __main__ - global_step = 7450, average loss = 0.07954016777908691
09/30/2023 12:56:27 - INFO - __main__ - global_step = 7500, average loss = 0.06745836471483926
09/30/2023 13:00:43 - INFO - __main__ - global_step = 7550, average loss = 0.07651237843449053
09/30/2023 13:04:59 - INFO - __main__ - global_step = 7600, average loss = 0.08067735946224275
09/30/2023 13:05:00 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 13:05:00 - INFO - __main__ - Num examples = 10000
09/30/2023 13:05:00 - INFO - __main__ - Batch size = 32
09/30/2023 13:09:19 - INFO - __main__ - ***** Eval results *****
09/30/2023 13:09:19 - INFO - __main__ - acc = 0.8296
09/30/2023 13:09:45 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 13:13:52 - INFO - __main__ - global_step = 7650, average loss = 0.07473264377593296
09/30/2023 13:18:02 - INFO - __main__ - global_step = 7700, average loss = 0.07815635729657515
09/30/2023 13:22:14 - INFO - __main__ - global_step = 7750, average loss = 0.08072268578209332
09/30/2023 13:26:29 - INFO - __main__ - global_step = 7800, average loss = 0.0779763015091885
09/30/2023 13:26:30 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 13:26:30 - INFO - __main__ - Num examples = 10000
09/30/2023 13:26:30 - INFO - __main__ - Batch size = 32
09/30/2023 13:30:49 - INFO - __main__ - ***** Eval results *****
09/30/2023 13:30:49 - INFO - __main__ - acc = 0.8052
09/30/2023 13:34:56 - INFO - __main__ - global_step = 7850, average loss = 0.08846644978621043
09/30/2023 13:39:08 - INFO - __main__ - global_step = 7900, average loss = 0.08965322268464661
09/30/2023 13:43:18 - INFO - __main__ - global_step = 7950, average loss = 0.07646228883138974
09/30/2023 13:47:34 - INFO - __main__ - global_step = 8000, average loss = 0.06746727024801658
09/30/2023 13:47:35 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 13:47:35 - INFO - __main__ - Num examples = 10000
09/30/2023 13:47:35 - INFO - __main__ - Batch size = 32
09/30/2023 13:51:54 - INFO - __main__ - ***** Eval results *****
09/30/2023 13:51:54 - INFO - __main__ - acc = 0.8243
09/30/2023 13:56:06 - INFO - __main__ - global_step = 8050, average loss = 0.08350399916278547
09/30/2023 14:00:19 - INFO - __main__ - global_step = 8100, average loss = 0.06798540580417466
09/30/2023 14:04:46 - INFO - __main__ - global_step = 8150, average loss = 0.06554304141827742
09/30/2023 14:09:04 - INFO - __main__ - global_step = 8200, average loss = 0.06514280185193229
09/30/2023 14:09:05 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 14:09:05 - INFO - __main__ - Num examples = 10000
09/30/2023 14:09:05 - INFO - __main__ - Batch size = 32
09/30/2023 14:13:23 - INFO - __main__ - ***** Eval results *****
09/30/2023 14:13:23 - INFO - __main__ - acc = 0.8146
09/30/2023 14:17:36 - INFO - __main__ - global_step = 8250, average loss = 0.07990871949750726
09/30/2023 14:21:47 - INFO - __main__ - global_step = 8300, average loss = 0.07364155332470546
09/30/2023 14:25:52 - INFO - __main__ - global_step = 8350, average loss = 0.08377082656683342
09/30/2023 14:30:12 - INFO - __main__ - global_step = 8400, average loss = 0.07954915106311092
09/30/2023 14:30:13 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 14:30:13 - INFO - __main__ - Num examples = 10000
09/30/2023 14:30:13 - INFO - __main__ - Batch size = 32
09/30/2023 14:34:32 - INFO - __main__ - ***** Eval results *****
09/30/2023 14:34:32 - INFO - __main__ - acc = 0.8148
09/30/2023 14:38:42 - INFO - __main__ - global_step = 8450, average loss = 0.07030039706209208
09/30/2023 14:42:55 - INFO - __main__ - global_step = 8500, average loss = 0.08003189989045495
09/30/2023 14:47:10 - INFO - __main__ - global_step = 8550, average loss = 0.07293609037540591
09/30/2023 14:51:25 - INFO - __main__ - global_step = 8600, average loss = 0.07146468496641319
09/30/2023 14:51:25 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 14:51:25 - INFO - __main__ - Num examples = 10000
09/30/2023 14:51:25 - INFO - __main__ - Batch size = 32
09/30/2023 14:55:43 - INFO - __main__ - ***** Eval results *****
09/30/2023 14:55:43 - INFO - __main__ - acc = 0.8119
09/30/2023 14:59:48 - INFO - __main__ - global_step = 8650, average loss = 0.08003535972715327
09/30/2023 15:03:55 - INFO - __main__ - global_step = 8700, average loss = 0.06597046624192444
09/30/2023 15:08:18 - INFO - __main__ - global_step = 8750, average loss = 0.07181154116915422
09/30/2023 15:12:39 - INFO - __main__ - global_step = 8800, average loss = 0.068559150480869
09/30/2023 15:12:40 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 15:12:40 - INFO - __main__ - Num examples = 10000
09/30/2023 15:12:40 - INFO - __main__ - Batch size = 32
09/30/2023 15:16:57 - INFO - __main__ - ***** Eval results *****
09/30/2023 15:16:57 - INFO - __main__ - acc = 0.8027
09/30/2023 15:20:57 - INFO - __main__ - global_step = 8850, average loss = 0.08192624930914462
09/30/2023 15:25:08 - INFO - __main__ - global_step = 8900, average loss = 0.06891920362562814
09/30/2023 15:29:21 - INFO - __main__ - global_step = 8950, average loss = 0.07183136703236868
09/30/2023 15:33:32 - INFO - __main__ - global_step = 9000, average loss = 0.07862215217377524
09/30/2023 15:33:32 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 15:33:32 - INFO - __main__ - Num examples = 10000
09/30/2023 15:33:32 - INFO - __main__ - Batch size = 32
09/30/2023 15:37:51 - INFO - __main__ - ***** Eval results *****
09/30/2023 15:37:51 - INFO - __main__ - acc = 0.8145
09/30/2023 15:42:00 - INFO - __main__ - global_step = 9050, average loss = 0.08039317954942816
09/30/2023 15:46:04 - INFO - __main__ - global_step = 9100, average loss = 0.07681855217753991
09/30/2023 15:50:19 - INFO - __main__ - global_step = 9150, average loss = 0.06908466021588539
09/30/2023 15:54:39 - INFO - __main__ - global_step = 9200, average loss = 0.07285123934067088
09/30/2023 15:54:40 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 15:54:40 - INFO - __main__ - Num examples = 10000
09/30/2023 15:54:40 - INFO - __main__ - Batch size = 32
09/30/2023 15:58:58 - INFO - __main__ - ***** Eval results *****
09/30/2023 15:58:58 - INFO - __main__ - acc = 0.8157
09/30/2023 16:03:12 - INFO - __main__ - global_step = 9250, average loss = 0.07457796319955377
09/30/2023 16:07:29 - INFO - __main__ - global_step = 9300, average loss = 0.08509899367534672
09/30/2023 16:11:53 - INFO - __main__ - global_step = 9350, average loss = 0.07013603730166323
09/30/2023 16:16:21 - INFO - __main__ - global_step = 9400, average loss = 0.07017059165984392
09/30/2023 16:16:22 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 16:16:22 - INFO - __main__ - Num examples = 10000
09/30/2023 16:16:22 - INFO - __main__ - Batch size = 32
09/30/2023 16:20:40 - INFO - __main__ - ***** Eval results *****
09/30/2023 16:20:40 - INFO - __main__ - acc = 0.8141
09/30/2023 16:24:51 - INFO - __main__ - global_step = 9450, average loss = 0.0831688746976215
09/30/2023 16:29:17 - INFO - __main__ - global_step = 9500, average loss = 0.08576202854252188
09/30/2023 16:33:37 - INFO - __main__ - global_step = 9550, average loss = 0.08213058317254764
09/30/2023 16:37:58 - INFO - __main__ - global_step = 9600, average loss = 0.072965028858016
09/30/2023 16:37:58 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 16:37:58 - INFO - __main__ - Num examples = 10000
09/30/2023 16:37:58 - INFO - __main__ - Batch size = 32
09/30/2023 16:42:15 - INFO - __main__ - ***** Eval results *****
09/30/2023 16:42:15 - INFO - __main__ - acc = 0.8122
09/30/2023 16:46:15 - INFO - __main__ - global_step = 9650, average loss = 0.07125714480011083
09/30/2023 16:50:19 - INFO - __main__ - global_step = 9700, average loss = 0.07434062254025775
09/30/2023 16:54:30 - INFO - __main__ - global_step = 9750, average loss = 0.07218598224179004
09/30/2023 16:58:52 - INFO - __main__ - global_step = 9800, average loss = 0.06753908861952368
09/30/2023 16:58:52 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 16:58:52 - INFO - __main__ - Num examples = 10000
09/30/2023 16:58:52 - INFO - __main__ - Batch size = 32
09/30/2023 17:03:10 - INFO - __main__ - ***** Eval results *****
09/30/2023 17:03:10 - INFO - __main__ - acc = 0.8208
09/30/2023 17:07:12 - INFO - __main__ - global_step = 9850, average loss = 0.0787789156648796
09/30/2023 17:11:24 - INFO - __main__ - global_step = 9900, average loss = 0.06863431145990034
09/30/2023 17:15:44 - INFO - __main__ - global_step = 9950, average loss = 0.0729100130192819
09/30/2023 17:20:01 - INFO - __main__ - global_step = 10000, average loss = 0.07118722895695101
09/30/2023 17:20:01 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 17:20:01 - INFO - __main__ - Num examples = 10000
09/30/2023 17:20:01 - INFO - __main__ - Batch size = 32
09/30/2023 17:24:20 - INFO - __main__ - ***** Eval results *****
09/30/2023 17:24:20 - INFO - __main__ - acc = 0.8212
09/30/2023 17:28:25 - INFO - __main__ - global_step = 10050, average loss = 0.06967489041242515
09/30/2023 17:32:40 - INFO - __main__ - global_step = 10100, average loss = 0.07503812584323896
09/30/2023 17:37:07 - INFO - __main__ - global_step = 10150, average loss = 0.07984486830362585
09/30/2023 17:41:19 - INFO - __main__ - global_step = 10200, average loss = 0.06886661994401948
09/30/2023 17:41:19 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 17:41:19 - INFO - __main__ - Num examples = 10000
09/30/2023 17:41:19 - INFO - __main__ - Batch size = 32
09/30/2023 17:45:37 - INFO - __main__ - ***** Eval results *****
09/30/2023 17:45:37 - INFO - __main__ - acc = 0.8134
09/30/2023 17:49:55 - INFO - __main__ - global_step = 10250, average loss = 0.07405807184350124
09/30/2023 17:54:14 - INFO - __main__ - global_step = 10300, average loss = 0.08030594819738326
09/30/2023 17:58:33 - INFO - __main__ - global_step = 10350, average loss = 0.08568550381663954
09/30/2023 18:02:39 - INFO - __main__ - global_step = 10400, average loss = 0.08110691699486779
09/30/2023 18:02:39 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 18:02:39 - INFO - __main__ - Num examples = 10000
09/30/2023 18:02:39 - INFO - __main__ - Batch size = 32
09/30/2023 18:07:00 - INFO - __main__ - ***** Eval results *****
09/30/2023 18:07:00 - INFO - __main__ - acc = 0.8226
09/30/2023 18:10:59 - INFO - __main__ - global_step = 10450, average loss = 0.07698049577564234
09/30/2023 18:15:18 - INFO - __main__ - global_step = 10500, average loss = 0.07489776252514276
09/30/2023 18:19:38 - INFO - __main__ - global_step = 10550, average loss = 0.08084082975808997
09/30/2023 18:24:06 - INFO - __main__ - global_step = 10600, average loss = 0.077233616621088
09/30/2023 18:24:06 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 18:24:06 - INFO - __main__ - Num examples = 10000
09/30/2023 18:24:06 - INFO - __main__ - Batch size = 32
09/30/2023 18:28:26 - INFO - __main__ - ***** Eval results *****
09/30/2023 18:28:26 - INFO - __main__ - acc = 0.8219
09/30/2023 18:32:23 - INFO - __main__ - global_step = 10650, average loss = 0.0749396042097942
09/30/2023 18:36:24 - INFO - __main__ - global_step = 10700, average loss = 0.07035453407006571
09/30/2023 18:40:30 - INFO - __main__ - global_step = 10750, average loss = 0.0701333080389304
09/30/2023 18:44:44 - INFO - __main__ - global_step = 10800, average loss = 0.06815460226869618
09/30/2023 18:44:45 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 18:44:45 - INFO - __main__ - Num examples = 10000
09/30/2023 18:44:45 - INFO - __main__ - Batch size = 32
09/30/2023 18:49:04 - INFO - __main__ - ***** Eval results *****
09/30/2023 18:49:04 - INFO - __main__ - acc = 0.8246
09/30/2023 18:53:04 - INFO - __main__ - global_step = 10850, average loss = 0.06231740675430046
09/30/2023 18:57:11 - INFO - __main__ - global_step = 10900, average loss = 0.07749273380759406
09/30/2023 19:01:27 - INFO - __main__ - global_step = 10950, average loss = 0.07356921623417292
09/30/2023 19:05:44 - INFO - __main__ - global_step = 11000, average loss = 0.06861940244401922
09/30/2023 19:05:44 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 19:05:44 - INFO - __main__ - Num examples = 10000
09/30/2023 19:05:44 - INFO - __main__ - Batch size = 32
09/30/2023 19:10:04 - INFO - __main__ - ***** Eval results *****
09/30/2023 19:10:04 - INFO - __main__ - acc = 0.8237
09/30/2023 19:13:58 - INFO - __main__ - global_step = 11050, average loss = 0.07190075869159046
09/30/2023 19:18:18 - INFO - __main__ - global_step = 11100, average loss = 0.07798185770014243
09/30/2023 19:22:25 - INFO - __main__ - global_step = 11150, average loss = 0.05461175944059505
09/30/2023 19:26:36 - INFO - __main__ - global_step = 11200, average loss = 0.07214928590841736
09/30/2023 19:26:36 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 19:26:36 - INFO - __main__ - Num examples = 10000
09/30/2023 19:26:36 - INFO - __main__ - Batch size = 32
09/30/2023 19:30:56 - INFO - __main__ - ***** Eval results *****
09/30/2023 19:30:56 - INFO - __main__ - acc = 0.8281
09/30/2023 19:34:46 - INFO - __main__ - global_step = 11250, average loss = 0.07595877689196641
09/30/2023 19:38:51 - INFO - __main__ - global_step = 11300, average loss = 0.06289271867310163
09/30/2023 19:42:58 - INFO - __main__ - global_step = 11350, average loss = 0.07287138866693567
09/30/2023 19:47:05 - INFO - __main__ - global_step = 11400, average loss = 0.0736375573805708
09/30/2023 19:47:05 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 19:47:05 - INFO - __main__ - Num examples = 10000
09/30/2023 19:47:05 - INFO - __main__ - Batch size = 32
09/30/2023 19:51:26 - INFO - __main__ - ***** Eval results *****
09/30/2023 19:51:26 - INFO - __main__ - acc = 0.8265
09/30/2023 19:55:14 - INFO - __main__ - global_step = 11450, average loss = 0.07105860608404328
09/30/2023 19:59:22 - INFO - __main__ - global_step = 11500, average loss = 0.07589100849851092
09/30/2023 20:03:31 - INFO - __main__ - global_step = 11550, average loss = 0.07193597211022279
09/30/2023 20:07:44 - INFO - __main__ - global_step = 11600, average loss = 0.0786158631305443
09/30/2023 20:07:45 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 20:07:45 - INFO - __main__ - Num examples = 10000
09/30/2023 20:07:45 - INFO - __main__ - Batch size = 32
09/30/2023 20:12:05 - INFO - __main__ - ***** Eval results *****
09/30/2023 20:12:05 - INFO - __main__ - acc = 0.8224
09/30/2023 20:16:14 - INFO - __main__ - global_step = 11650, average loss = 0.07484395604304155
09/30/2023 20:20:16 - INFO - __main__ - global_step = 11700, average loss = 0.07182746810896788
09/30/2023 20:24:28 - INFO - __main__ - global_step = 11750, average loss = 0.06392118992527684
09/30/2023 20:28:47 - INFO - __main__ - global_step = 11800, average loss = 0.06359485059540021
09/30/2023 20:28:48 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 20:28:48 - INFO - __main__ - Num examples = 10000
09/30/2023 20:28:48 - INFO - __main__ - Batch size = 32
09/30/2023 20:33:07 - INFO - __main__ - ***** Eval results *****
09/30/2023 20:33:07 - INFO - __main__ - acc = 0.8225
09/30/2023 20:36:55 - INFO - __main__ - global_step = 11850, average loss = 0.06557874951142367
09/30/2023 20:40:51 - INFO - __main__ - global_step = 11900, average loss = 0.06787695961887948
09/30/2023 20:45:01 - INFO - __main__ - global_step = 11950, average loss = 0.07802391385892406
09/30/2023 20:49:19 - INFO - __main__ - global_step = 12000, average loss = 0.062383338503277624
09/30/2023 20:49:19 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 20:49:19 - INFO - __main__ - Num examples = 10000
09/30/2023 20:49:19 - INFO - __main__ - Batch size = 32
09/30/2023 20:53:41 - INFO - __main__ - ***** Eval results *****
09/30/2023 20:53:41 - INFO - __main__ - acc = 0.8221
09/30/2023 20:57:31 - INFO - __main__ - global_step = 12050, average loss = 0.07041985652205768
09/30/2023 21:01:33 - INFO - __main__ - global_step = 12100, average loss = 0.07904728068271652
09/30/2023 21:05:47 - INFO - __main__ - global_step = 12150, average loss = 0.07474817682654247
09/30/2023 21:09:58 - INFO - __main__ - global_step = 12200, average loss = 0.07402907914118259
09/30/2023 21:09:58 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 21:09:58 - INFO - __main__ - Num examples = 10000
09/30/2023 21:09:58 - INFO - __main__ - Batch size = 32
09/30/2023 21:14:19 - INFO - __main__ - ***** Eval results *****
09/30/2023 21:14:19 - INFO - __main__ - acc = 0.8327
09/30/2023 21:14:46 - INFO - __main__ - Saving model checkpoint to output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6
09/30/2023 21:18:46 - INFO - __main__ - global_step = 12250, average loss = 0.07039213450989337
09/30/2023 21:22:59 - INFO - __main__ - global_step = 12300, average loss = 0.0842395970186044
09/30/2023 21:27:05 - INFO - __main__ - global_step = 12350, average loss = 0.06603515204827999
09/30/2023 21:31:22 - INFO - __main__ - global_step = 12400, average loss = 0.06760421821546515
09/30/2023 21:31:22 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 21:31:22 - INFO - __main__ - Num examples = 10000
09/30/2023 21:31:22 - INFO - __main__ - Batch size = 32
09/30/2023 21:35:43 - INFO - __main__ - ***** Eval results *****
09/30/2023 21:35:43 - INFO - __main__ - acc = 0.8208
09/30/2023 21:39:33 - INFO - __main__ - global_step = 12450, average loss = 0.06917047601906233
09/30/2023 21:43:44 - INFO - __main__ - global_step = 12500, average loss = 0.07573592953915068
09/30/2023 21:48:03 - INFO - __main__ - global_step = 12550, average loss = 0.06653125052485848
09/30/2023 21:52:22 - INFO - __main__ - global_step = 12600, average loss = 0.06815064429247286
09/30/2023 21:52:23 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 21:52:23 - INFO - __main__ - Num examples = 10000
09/30/2023 21:52:23 - INFO - __main__ - Batch size = 32
09/30/2023 21:56:43 - INFO - __main__ - ***** Eval results *****
09/30/2023 21:56:43 - INFO - __main__ - acc = 0.819
09/30/2023 22:00:39 - INFO - __main__ - global_step = 12650, average loss = 0.07732899946378893
09/30/2023 22:04:44 - INFO - __main__ - global_step = 12700, average loss = 0.06547158910783764
09/30/2023 22:08:49 - INFO - __main__ - global_step = 12750, average loss = 0.0728905378174386
09/30/2023 22:13:03 - INFO - __main__ - global_step = 12800, average loss = 0.07366545890477937
09/30/2023 22:13:04 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 22:13:04 - INFO - __main__ - Num examples = 10000
09/30/2023 22:13:04 - INFO - __main__ - Batch size = 32
09/30/2023 22:17:25 - INFO - __main__ - ***** Eval results *****
09/30/2023 22:17:25 - INFO - __main__ - acc = 0.8182
09/30/2023 22:21:28 - INFO - __main__ - global_step = 12850, average loss = 0.06894337675126735
09/30/2023 22:25:41 - INFO - __main__ - global_step = 12900, average loss = 0.07351460054007475
09/30/2023 22:29:49 - INFO - __main__ - global_step = 12950, average loss = 0.0674650944762834
09/30/2023 22:34:09 - INFO - __main__ - global_step = 13000, average loss = 0.07850258736492834
09/30/2023 22:34:09 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 22:34:09 - INFO - __main__ - Num examples = 10000
09/30/2023 22:34:09 - INFO - __main__ - Batch size = 32
09/30/2023 22:38:30 - INFO - __main__ - ***** Eval results *****
09/30/2023 22:38:30 - INFO - __main__ - acc = 0.8321
09/30/2023 22:42:24 - INFO - __main__ - global_step = 13050, average loss = 0.07653208828101925
09/30/2023 22:46:20 - INFO - __main__ - global_step = 13100, average loss = 0.06802368102005857
09/30/2023 22:50:29 - INFO - __main__ - global_step = 13150, average loss = 0.06454230795552576
09/30/2023 22:54:34 - INFO - __main__ - global_step = 13200, average loss = 0.07258539929578546
09/30/2023 22:54:35 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 22:54:35 - INFO - __main__ - Num examples = 10000
09/30/2023 22:54:35 - INFO - __main__ - Batch size = 32
09/30/2023 22:58:54 - INFO - __main__ - ***** Eval results *****
09/30/2023 22:58:54 - INFO - __main__ - acc = 0.8252
09/30/2023 23:02:57 - INFO - __main__ - global_step = 13250, average loss = 0.07325911161562544
09/30/2023 23:07:12 - INFO - __main__ - global_step = 13300, average loss = 0.06880584957727479
09/30/2023 23:11:21 - INFO - __main__ - global_step = 13350, average loss = 0.07009069720297703
09/30/2023 23:15:34 - INFO - __main__ - global_step = 13400, average loss = 0.07083460625182852
09/30/2023 23:15:35 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 23:15:35 - INFO - __main__ - Num examples = 10000
09/30/2023 23:15:35 - INFO - __main__ - Batch size = 32
09/30/2023 23:19:56 - INFO - __main__ - ***** Eval results *****
09/30/2023 23:19:56 - INFO - __main__ - acc = 0.813
09/30/2023 23:23:55 - INFO - __main__ - global_step = 13450, average loss = 0.06977577161625959
09/30/2023 23:27:49 - INFO - __main__ - global_step = 13500, average loss = 0.0730690676838276
09/30/2023 23:31:51 - INFO - __main__ - global_step = 13550, average loss = 0.07233811266596604
09/30/2023 23:35:53 - INFO - __main__ - global_step = 13600, average loss = 0.0773136636797426
09/30/2023 23:35:54 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 23:35:54 - INFO - __main__ - Num examples = 10000
09/30/2023 23:35:54 - INFO - __main__ - Batch size = 32
09/30/2023 23:40:14 - INFO - __main__ - ***** Eval results *****
09/30/2023 23:40:14 - INFO - __main__ - acc = 0.8254
09/30/2023 23:44:18 - INFO - __main__ - global_step = 13650, average loss = 0.0625762648001546
09/30/2023 23:48:29 - INFO - __main__ - global_step = 13700, average loss = 0.07835062241327251
09/30/2023 23:52:47 - INFO - __main__ - global_step = 13750, average loss = 0.06917831582177314
09/30/2023 23:57:06 - INFO - __main__ - global_step = 13800, average loss = 0.06653823942549934
09/30/2023 23:57:07 - INFO - __main__ - ***** Running evaluation *****
09/30/2023 23:57:07 - INFO - __main__ - Num examples = 10000
09/30/2023 23:57:07 - INFO - __main__ - Batch size = 32
10/01/2023 00:01:27 - INFO - __main__ - ***** Eval results *****
10/01/2023 00:01:27 - INFO - __main__ - acc = 0.8231
10/01/2023 00:05:24 - INFO - __main__ - global_step = 13850, average loss = 0.07134979092643334
10/01/2023 00:09:31 - INFO - __main__ - global_step = 13900, average loss = 0.07882154490274842
10/01/2023 00:13:33 - INFO - __main__ - global_step = 13950, average loss = 0.067044138008132
10/01/2023 00:17:54 - INFO - __main__ - global_step = 14000, average loss = 0.06602240080737828
10/01/2023 00:17:55 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 00:17:55 - INFO - __main__ - Num examples = 10000
10/01/2023 00:17:55 - INFO - __main__ - Batch size = 32
10/01/2023 00:22:16 - INFO - __main__ - ***** Eval results *****
10/01/2023 00:22:16 - INFO - __main__ - acc = 0.8185
10/01/2023 00:26:20 - INFO - __main__ - global_step = 14050, average loss = 0.07546966458212409
10/01/2023 00:30:49 - INFO - __main__ - global_step = 14100, average loss = 0.06855787578620948
10/01/2023 00:35:15 - INFO - __main__ - global_step = 14150, average loss = 0.06737258993505747
10/01/2023 00:39:39 - INFO - __main__ - global_step = 14200, average loss = 0.05966844407041208
10/01/2023 00:39:40 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 00:39:40 - INFO - __main__ - Num examples = 10000
10/01/2023 00:39:40 - INFO - __main__ - Batch size = 32
10/01/2023 00:44:00 - INFO - __main__ - ***** Eval results *****
10/01/2023 00:44:00 - INFO - __main__ - acc = 0.8282
10/01/2023 00:47:56 - INFO - __main__ - global_step = 14250, average loss = 0.0709371871012263
10/01/2023 00:51:54 - INFO - __main__ - global_step = 14300, average loss = 0.07779215545522675
10/01/2023 00:56:02 - INFO - __main__ - global_step = 14350, average loss = 0.06590510867084959
10/01/2023 01:00:08 - INFO - __main__ - global_step = 14400, average loss = 0.061885312875092496
10/01/2023 01:00:09 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 01:00:09 - INFO - __main__ - Num examples = 10000
10/01/2023 01:00:09 - INFO - __main__ - Batch size = 32
10/01/2023 01:04:29 - INFO - __main__ - ***** Eval results *****
10/01/2023 01:04:29 - INFO - __main__ - acc = 0.8195
10/01/2023 01:08:20 - INFO - __main__ - global_step = 14450, average loss = 0.07757491528376705
10/01/2023 01:12:26 - INFO - __main__ - global_step = 14500, average loss = 0.061351443203457166
10/01/2023 01:16:44 - INFO - __main__ - global_step = 14550, average loss = 0.06742463728594884
10/01/2023 01:20:55 - INFO - __main__ - global_step = 14600, average loss = 0.06395716872473713
10/01/2023 01:20:56 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 01:20:56 - INFO - __main__ - Num examples = 10000
10/01/2023 01:20:56 - INFO - __main__ - Batch size = 32
10/01/2023 01:25:16 - INFO - __main__ - ***** Eval results *****
10/01/2023 01:25:16 - INFO - __main__ - acc = 0.8271
10/01/2023 01:29:11 - INFO - __main__ - global_step = 14650, average loss = 0.0680865884249215
10/01/2023 01:33:17 - INFO - __main__ - global_step = 14700, average loss = 0.07319515083199804
10/01/2023 01:37:31 - INFO - __main__ - global_step = 14750, average loss = 0.0750861974158397
10/01/2023 01:41:39 - INFO - __main__ - global_step = 14800, average loss = 0.07455838610287174
10/01/2023 01:41:39 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 01:41:39 - INFO - __main__ - Num examples = 10000
10/01/2023 01:41:39 - INFO - __main__ - Batch size = 32
10/01/2023 01:45:59 - INFO - __main__ - ***** Eval results *****
10/01/2023 01:45:59 - INFO - __main__ - acc = 0.8285
10/01/2023 01:49:49 - INFO - __main__ - global_step = 14850, average loss = 0.0746920863639025
10/01/2023 01:53:48 - INFO - __main__ - global_step = 14900, average loss = 0.06193213762038795
10/01/2023 01:58:00 - INFO - __main__ - global_step = 14950, average loss = 0.0684903811987897
10/01/2023 02:02:20 - INFO - __main__ - global_step = 15000, average loss = 0.07475626632280181
10/01/2023 02:02:21 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 02:02:21 - INFO - __main__ - Num examples = 10000
10/01/2023 02:02:21 - INFO - __main__ - Batch size = 32
10/01/2023 02:06:40 - INFO - __main__ - ***** Eval results *****
10/01/2023 02:06:40 - INFO - __main__ - acc = 0.8221
10/01/2023 02:10:33 - INFO - __main__ - global_step = 15050, average loss = 0.06398421550955391
10/01/2023 02:14:31 - INFO - __main__ - global_step = 15100, average loss = 0.07387388837814797
10/01/2023 02:18:36 - INFO - __main__ - global_step = 15150, average loss = 0.07215547483820046
10/01/2023 02:22:42 - INFO - __main__ - global_step = 15200, average loss = 0.06692371807614109
10/01/2023 02:22:42 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 02:22:42 - INFO - __main__ - Num examples = 10000
10/01/2023 02:22:42 - INFO - __main__ - Batch size = 32
10/01/2023 02:27:06 - INFO - __main__ - ***** Eval results *****
10/01/2023 02:27:06 - INFO - __main__ - acc = 0.828
10/01/2023 02:31:03 - INFO - __main__ - global_step = 15250, average loss = 0.07475481618889716
10/01/2023 02:35:03 - INFO - __main__ - global_step = 15300, average loss = 0.06605282124131918
10/01/2023 02:39:06 - INFO - __main__ - global_step = 15350, average loss = 0.0742860847054817
10/01/2023 02:43:08 - INFO - __main__ - global_step = 15400, average loss = 0.06508645007126689
10/01/2023 02:43:09 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 02:43:09 - INFO - __main__ - Num examples = 10000
10/01/2023 02:43:09 - INFO - __main__ - Batch size = 32
10/01/2023 02:47:27 - INFO - __main__ - ***** Eval results *****
10/01/2023 02:47:27 - INFO - __main__ - acc = 0.8244
10/01/2023 02:51:15 - INFO - __main__ - global_step = 15450, average loss = 0.0657403554152188
10/01/2023 02:55:25 - INFO - __main__ - global_step = 15500, average loss = 0.06363382869447377
10/01/2023 02:59:33 - INFO - __main__ - global_step = 15550, average loss = 0.068332606570184
10/01/2023 03:03:36 - INFO - __main__ - global_step = 15600, average loss = 0.0638002801532275
10/01/2023 03:03:37 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 03:03:37 - INFO - __main__ - Num examples = 10000
10/01/2023 03:03:37 - INFO - __main__ - Batch size = 32
10/01/2023 03:07:54 - INFO - __main__ - ***** Eval results *****
10/01/2023 03:07:54 - INFO - __main__ - acc = 0.8245
10/01/2023 03:11:47 - INFO - __main__ - global_step = 15650, average loss = 0.07057813088395051
10/01/2023 03:15:48 - INFO - __main__ - global_step = 15700, average loss = 0.059586076617561046
10/01/2023 03:19:50 - INFO - __main__ - global_step = 15750, average loss = 0.06329842852351249
10/01/2023 03:24:07 - INFO - __main__ - global_step = 15800, average loss = 0.0673095579940309
10/01/2023 03:24:08 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 03:24:08 - INFO - __main__ - Num examples = 10000
10/01/2023 03:24:08 - INFO - __main__ - Batch size = 32
10/01/2023 03:28:27 - INFO - __main__ - ***** Eval results *****
10/01/2023 03:28:27 - INFO - __main__ - acc = 0.8191
10/01/2023 03:32:25 - INFO - __main__ - global_step = 15850, average loss = 0.06719043602446619
10/01/2023 03:36:22 - INFO - __main__ - global_step = 15900, average loss = 0.06470626855618321
10/01/2023 03:40:22 - INFO - __main__ - global_step = 15950, average loss = 0.0673678615699464
10/01/2023 03:44:32 - INFO - __main__ - global_step = 16000, average loss = 0.06654785299411742
10/01/2023 03:44:32 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 03:44:32 - INFO - __main__ - Num examples = 10000
10/01/2023 03:44:32 - INFO - __main__ - Batch size = 32
10/01/2023 03:48:51 - INFO - __main__ - ***** Eval results *****
10/01/2023 03:48:51 - INFO - __main__ - acc = 0.826
10/01/2023 03:52:42 - INFO - __main__ - global_step = 16050, average loss = 0.07211193255971012
10/01/2023 03:56:30 - INFO - __main__ - global_step = 16100, average loss = 0.07810956820030697
10/01/2023 04:00:37 - INFO - __main__ - global_step = 16150, average loss = 0.07871339554849328
10/01/2023 04:04:48 - INFO - __main__ - global_step = 16200, average loss = 0.06766451962915199
10/01/2023 04:04:48 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 04:04:48 - INFO - __main__ - Num examples = 10000
10/01/2023 04:04:48 - INFO - __main__ - Batch size = 32
10/01/2023 04:09:07 - INFO - __main__ - ***** Eval results *****
10/01/2023 04:09:07 - INFO - __main__ - acc = 0.8234
10/01/2023 04:13:00 - INFO - __main__ - global_step = 16250, average loss = 0.07233332002186216
10/01/2023 04:17:08 - INFO - __main__ - global_step = 16300, average loss = 0.06269402921956498
10/01/2023 04:21:18 - INFO - __main__ - global_step = 16350, average loss = 0.066974333815524
10/01/2023 04:25:36 - INFO - __main__ - global_step = 16400, average loss = 0.06326851320967762
10/01/2023 04:25:36 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 04:25:36 - INFO - __main__ - Num examples = 10000
10/01/2023 04:25:36 - INFO - __main__ - Batch size = 32
10/01/2023 04:29:55 - INFO - __main__ - ***** Eval results *****
10/01/2023 04:29:55 - INFO - __main__ - acc = 0.8218
10/01/2023 04:33:53 - INFO - __main__ - global_step = 16450, average loss = 0.0583337911261151
10/01/2023 04:38:00 - INFO - __main__ - global_step = 16500, average loss = 0.06651346774706327
10/01/2023 04:42:10 - INFO - __main__ - global_step = 16550, average loss = 0.07442569829370768
10/01/2023 04:46:19 - INFO - __main__ - global_step = 16600, average loss = 0.0704036247156182
10/01/2023 04:46:19 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 04:46:19 - INFO - __main__ - Num examples = 10000
10/01/2023 04:46:19 - INFO - __main__ - Batch size = 32
10/01/2023 04:50:38 - INFO - __main__ - ***** Eval results *****
10/01/2023 04:50:38 - INFO - __main__ - acc = 0.8268
10/01/2023 04:54:40 - INFO - __main__ - global_step = 16650, average loss = 0.07102784802380484
10/01/2023 04:58:39 - INFO - __main__ - global_step = 16700, average loss = 0.07482151540141785
10/01/2023 05:02:48 - INFO - __main__ - global_step = 16750, average loss = 0.06266404812475229
10/01/2023 05:06:49 - INFO - __main__ - global_step = 16800, average loss = 0.06936132206232287
10/01/2023 05:06:50 - INFO - __main__ - ***** Running evaluation *****
10/01/2023 05:06:50 - INFO - __main__ - Num examples = 10000
10/01/2023 05:06:50 - INFO - __main__ - Batch size = 32
10/01/2023 05:11:07 - INFO - __main__ - ***** Eval results *****
10/01/2023 05:11:07 - INFO - __main__ -
acc = 0.8313 10/01/2023 05:15:16 - INFO - __main__ - global_step = 16850, average loss = 0.060352628196997105 10/01/2023 05:19:33 - INFO - __main__ - global_step = 16900, average loss = 0.0641949670168833 10/01/2023 05:23:53 - INFO - __main__ - global_step = 16950, average loss = 0.0711748162342701 10/01/2023 05:28:04 - INFO - __main__ - global_step = 17000, average loss = 0.07767359625780955 10/01/2023 05:28:05 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 05:28:05 - INFO - __main__ - Num examples = 10000 10/01/2023 05:28:05 - INFO - __main__ - Batch size = 32 10/01/2023 05:32:22 - INFO - __main__ - ***** Eval results ***** 10/01/2023 05:32:22 - INFO - __main__ - acc = 0.8302 10/01/2023 05:36:19 - INFO - __main__ - global_step = 17050, average loss = 0.059951672412971675 10/01/2023 05:40:23 - INFO - __main__ - global_step = 17100, average loss = 0.0679468241819086 10/01/2023 05:44:37 - INFO - __main__ - global_step = 17150, average loss = 0.06287542213140114 10/01/2023 05:48:53 - INFO - __main__ - global_step = 17200, average loss = 0.07064101672236575 10/01/2023 05:48:53 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 05:48:53 - INFO - __main__ - Num examples = 10000 10/01/2023 05:48:53 - INFO - __main__ - Batch size = 32 10/01/2023 05:53:11 - INFO - __main__ - ***** Eval results ***** 10/01/2023 05:53:11 - INFO - __main__ - acc = 0.8288 10/01/2023 05:57:08 - INFO - __main__ - global_step = 17250, average loss = 0.06821862254073494 10/01/2023 06:01:07 - INFO - __main__ - global_step = 17300, average loss = 0.06737288911346695 10/01/2023 06:05:09 - INFO - __main__ - global_step = 17350, average loss = 0.06524526451248676 10/01/2023 06:09:17 - INFO - __main__ - global_step = 17400, average loss = 0.06838752188666604 10/01/2023 06:09:17 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 06:09:17 - INFO - __main__ - Num examples = 10000 10/01/2023 06:09:17 - INFO - __main__ - Batch size = 32 10/01/2023 06:13:34 - INFO - __main__ - ***** Eval results ***** 10/01/2023 06:13:34 - INFO - __main__ - acc = 0.8292 10/01/2023 06:17:34 - INFO - __main__ - global_step = 17450, average loss = 0.07033179465208378 10/01/2023 06:21:42 - INFO - __main__ - global_step = 17500, average loss = 0.07338941472058651 10/01/2023 06:25:54 - INFO - __main__ - global_step = 17550, average loss = 0.06760536882744418 10/01/2023 06:30:29 - INFO - __main__ - global_step = 17600, average loss = 0.06395369231896893 10/01/2023 06:30:30 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 06:30:30 - INFO - __main__ - Num examples = 10000 10/01/2023 06:30:30 - INFO - __main__ - Batch size = 32 10/01/2023 06:34:46 - INFO - __main__ - ***** Eval results ***** 10/01/2023 06:34:46 - INFO - __main__ - acc = 0.8226 10/01/2023 06:38:42 - INFO - __main__ - global_step = 17650, average loss = 0.0788995540245378 10/01/2023 06:42:45 - INFO - __main__ - global_step = 17700, average loss = 0.058938835552726235 10/01/2023 06:46:55 - INFO - __main__ - global_step = 17750, average loss = 0.062029462043719834 10/01/2023 06:51:15 - INFO - __main__ - global_step = 17800, average loss = 0.07220558329383493 10/01/2023 06:51:15 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 06:51:15 - INFO - __main__ - Num examples = 10000 10/01/2023 06:51:15 - INFO - __main__ - Batch size = 32 10/01/2023 06:55:33 - INFO - __main__ - ***** Eval results ***** 10/01/2023 06:55:33 - INFO - __main__ - acc = 0.823 10/01/2023 06:59:32 - INFO - __main__ - global_step = 17850, average loss = 
0.07046543042039048 10/01/2023 07:03:39 - INFO - __main__ - global_step = 17900, average loss = 0.0620857437804807 10/01/2023 07:07:50 - INFO - __main__ - global_step = 17950, average loss = 0.05406381562563183 10/01/2023 07:12:05 - INFO - __main__ - global_step = 18000, average loss = 0.05979254503792617 10/01/2023 07:12:05 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 07:12:05 - INFO - __main__ - Num examples = 10000 10/01/2023 07:12:05 - INFO - __main__ - Batch size = 32 10/01/2023 07:16:22 - INFO - __main__ - ***** Eval results ***** 10/01/2023 07:16:22 - INFO - __main__ - acc = 0.8237 10/01/2023 07:20:13 - INFO - __main__ - global_step = 18050, average loss = 0.06541542315782863 10/01/2023 07:24:31 - INFO - __main__ - global_step = 18100, average loss = 0.06534778851972078 10/01/2023 07:28:50 - INFO - __main__ - global_step = 18150, average loss = 0.06520377914806887 10/01/2023 07:33:09 - INFO - __main__ - global_step = 18200, average loss = 0.05995443502964917 10/01/2023 07:33:10 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 07:33:10 - INFO - __main__ - Num examples = 10000 10/01/2023 07:33:10 - INFO - __main__ - Batch size = 32 10/01/2023 07:37:27 - INFO - __main__ - ***** Eval results ***** 10/01/2023 07:37:27 - INFO - __main__ - acc = 0.825 10/01/2023 07:41:29 - INFO - __main__ - global_step = 18250, average loss = 0.0729160438424151 10/01/2023 07:45:44 - INFO - __main__ - global_step = 18300, average loss = 0.06983143856698007 10/01/2023 07:48:53 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 07:48:53 - INFO - __main__ - Num examples = 10000 10/01/2023 07:48:53 - INFO - __main__ - Batch size = 32 10/01/2023 07:53:22 - INFO - __main__ - ***** Eval results ***** 10/01/2023 07:53:22 - INFO - __main__ - acc = 0.8249 10/01/2023 07:53:22 - INFO - __main__ - global_step = 18336, average loss = 0.09140925639286196 10/01/2023 07:53:56 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 07:53:56 - INFO - __main__ - Num examples = 10000 10/01/2023 07:53:56 - INFO - __main__ - Batch size = 32 10/01/2023 07:58:24 - INFO - __main__ - ***** Eval results ***** 10/01/2023 07:58:24 - INFO - __main__ - acc = 0.8326 10/01/2023 07:58:30 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/socialiqa_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', out_dir='./eval_results/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', device=0, reader='socialiqa', overwrite_output_dir=False, cache_dir=None) 10/01/2023 07:58:30 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6 10/01/2023 08:06:13 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/winogrande_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', out_dir='./eval_results/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', device=0, reader='winogrande', overwrite_output_dir=False, cache_dir=None) 10/01/2023 08:06:13 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6 10/01/2023 08:08:40 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/piqa_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', out_dir='./eval_results/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', device=0, reader='piqa', overwrite_output_dir=False, 
cache_dir=None) 10/01/2023 08:08:40 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6 10/01/2023 08:17:19 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/commonsenseqa_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', out_dir='./eval_results/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', device=0, reader='commonsenseqa', overwrite_output_dir=False, cache_dir=None) 10/01/2023 08:17:19 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6 10/01/2023 08:23:12 - INFO - evaluate_DeBERTa - Namespace(dataset_file='../../../data/mcqa/eval/anli_dev.jsonl', lm='output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', out_dir='./eval_results/deberta-v3-large_car_2i_name_100k_seed_101_5e-6', device=0, reader='anli', overwrite_output_dir=False, cache_dir=None) 10/01/2023 08:23:12 - INFO - evaluate_DeBERTa - Initializing output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6 10/01/2023 08:28:58 - INFO - __main__ - ***** Running evaluation ***** 10/01/2023 08:28:58 - INFO - __main__ - Num examples = 120 10/01/2023 08:28:58 - INFO - __main__ - Batch size = 32 10/01/2023 08:29:16 - INFO - __main__ - ***** Eval results ***** 10/01/2023 08:29:16 - INFO - __main__ - acc = 0.475
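
[Editor's note] The five external dev-set evaluations logged above (socialiqa, winogrande, piqa, commonsenseqa, anli) could be driven by a small wrapper like the sketch below. This is only an illustrative, hypothetical driver, not the project's actual tooling: it assumes the entry point is a script named evaluate_DeBERTa.py whose command-line flags mirror the Namespace fields recorded in the log (--dataset_file, --lm, --out_dir, --device, --reader); the checkpoint path, output directory, and dataset paths are taken verbatim from the log.

# Hypothetical driver for the external evaluations shown in the log above.
# Assumes evaluate_DeBERTa.py exposes argparse flags matching the logged Namespace fields.
import subprocess

CHECKPOINT = "output/Output_ATOMIC-pseudo-wWC/car_2i/deberta-v3-large_car_2i_name_100k_seed_101_5e-6"
OUT_DIR = "./eval_results/deberta-v3-large_car_2i_name_100k_seed_101_5e-6"
READERS = ["socialiqa", "winogrande", "piqa", "commonsenseqa", "anli"]

for reader in READERS:
    # One subprocess per benchmark, mirroring the Namespace logged for each run.
    subprocess.run(
        [
            "python", "evaluate_DeBERTa.py",
            "--dataset_file", f"../../../data/mcqa/eval/{reader}_dev.jsonl",
            "--lm", CHECKPOINT,
            "--out_dir", OUT_DIR,
            "--device", "0",
            "--reader", reader,
        ],
        check=True,  # abort on the first failing evaluation
    )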