2023-10-11 12:17:47,458 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,460 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 12:17:47,460 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,461 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-11 12:17:47,461 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,461 Train: 1085 sentences
2023-10-11 12:17:47,461 (train_with_dev=False, train_with_test=False)
2023-10-11 12:17:47,461 ----------------------------------------------------------------------------------------------------
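For context, a hedged sketch of how this corpus and tagger could be assembled with Flair; the dataset name, language, tag set and "crfFalse" come from the log, while the hidden size, the use of Flair's NER_HIPE_2022 loader, and the remaining flags are assumptions:

```python
# Sketch only: `embeddings` is the TransformerWordEmbeddings object from the sketch above.
from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")  # 1085 train / 148 dev / 364 test sentences
label_dict = corpus.make_label_dictionary(label_type="ner")    # span labels LOC, PER, ORG, HumanProd

tagger = SequenceTagger(
    hidden_size=256,                 # assumption; not recorded in the log
    embeddings=embeddings,
    tag_dictionary=label_dict,       # the tagger derives the 17 BIOES tags listed at the end of this log
    tag_type="ner",
    use_crf=False,                   # "crfFalse" in the base path
    use_rnn=False,                   # assumption: the dump shows only LockedDropout + a single Linear head
    reproject_embeddings=False,      # assumption: Linear(1472 -> 17) maps embeddings directly to tag scores
)
```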
2023-10-11 12:17:47,461 Training Params:
2023-10-11 12:17:47,461 - learning_rate: "0.00016"
2023-10-11 12:17:47,461 - mini_batch_size: "4"
2023-10-11 12:17:47,461 - max_epochs: "10"
2023-10-11 12:17:47,461 - shuffle: "True"
2023-10-11 12:17:47,461 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,461 Plugins:
2023-10-11 12:17:47,461 - TensorboardLogger
2023-10-11 12:17:47,462 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 12:17:47,462 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,462 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 12:17:47,462 - metric: "('micro avg', 'f1-score')"
2023-10-11 12:17:47,462 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,462 Computation:
2023-10-11 12:17:47,462 - compute on device: cuda:0
2023-10-11 12:17:47,462 - embedding storage: none
2023-10-11 12:17:47,462 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,462 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4"
2023-10-11 12:17:47,462 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,462 ----------------------------------------------------------------------------------------------------
2023-10-11 12:17:47,462 Logging anything other than scalars to TensorBoard is currently not supported.
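A hedged sketch of a fine-tuning call matching the parameters logged above (learning rate 0.00016, mini-batch size 4, 10 epochs, shuffling on); this is an approximation, not the authors' exact script:

```python
# Sketch only: `tagger` and `corpus` come from the sketches above.
# In recent Flair versions, fine_tune() uses AdamW with a linear LR schedule and warmup
# by default, which is consistent with the LinearScheduler plugin (warmup_fraction 0.1) logged above.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4",
    learning_rate=0.00016,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
)
```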
2023-10-11 12:17:57,185 epoch 1 - iter 27/272 - loss 2.83784856 - time (sec): 9.72 - samples/sec: 573.53 - lr: 0.000015 - momentum: 0.000000
2023-10-11 12:18:07,237 epoch 1 - iter 54/272 - loss 2.83077250 - time (sec): 19.77 - samples/sec: 575.99 - lr: 0.000031 - momentum: 0.000000
2023-10-11 12:18:16,704 epoch 1 - iter 81/272 - loss 2.80770322 - time (sec): 29.24 - samples/sec: 570.04 - lr: 0.000047 - momentum: 0.000000
2023-10-11 12:18:25,982 epoch 1 - iter 108/272 - loss 2.75243362 - time (sec): 38.52 - samples/sec: 565.04 - lr: 0.000063 - momentum: 0.000000
2023-10-11 12:18:34,945 epoch 1 - iter 135/272 - loss 2.67455773 - time (sec): 47.48 - samples/sec: 554.80 - lr: 0.000079 - momentum: 0.000000
2023-10-11 12:18:43,921 epoch 1 - iter 162/272 - loss 2.57823913 - time (sec): 56.46 - samples/sec: 550.87 - lr: 0.000095 - momentum: 0.000000
2023-10-11 12:18:52,759 epoch 1 - iter 189/272 - loss 2.47103511 - time (sec): 65.29 - samples/sec: 548.01 - lr: 0.000111 - momentum: 0.000000
2023-10-11 12:19:02,746 epoch 1 - iter 216/272 - loss 2.34565635 - time (sec): 75.28 - samples/sec: 551.99 - lr: 0.000126 - momentum: 0.000000
2023-10-11 12:19:11,733 epoch 1 - iter 243/272 - loss 2.22212358 - time (sec): 84.27 - samples/sec: 552.05 - lr: 0.000142 - momentum: 0.000000
2023-10-11 12:19:20,836 epoch 1 - iter 270/272 - loss 2.09514760 - time (sec): 93.37 - samples/sec: 553.28 - lr: 0.000158 - momentum: 0.000000
2023-10-11 12:19:21,365 ----------------------------------------------------------------------------------------------------
2023-10-11 12:19:21,365 EPOCH 1 done: loss 2.0875 - lr: 0.000158
2023-10-11 12:19:26,097 DEV : loss 0.7258709073066711 - f1-score (micro avg) 0.0
2023-10-11 12:19:26,106 ----------------------------------------------------------------------------------------------------
2023-10-11 12:19:35,428 epoch 2 - iter 27/272 - loss 0.70983607 - time (sec): 9.32 - samples/sec: 573.84 - lr: 0.000158 - momentum: 0.000000
2023-10-11 12:19:44,149 epoch 2 - iter 54/272 - loss 0.68848670 - time (sec): 18.04 - samples/sec: 560.06 - lr: 0.000157 - momentum: 0.000000
2023-10-11 12:19:52,270 epoch 2 - iter 81/272 - loss 0.66075679 - time (sec): 26.16 - samples/sec: 534.87 - lr: 0.000155 - momentum: 0.000000
2023-10-11 12:20:02,359 epoch 2 - iter 108/272 - loss 0.60795335 - time (sec): 36.25 - samples/sec: 557.87 - lr: 0.000153 - momentum: 0.000000
2023-10-11 12:20:11,747 epoch 2 - iter 135/272 - loss 0.58284167 - time (sec): 45.64 - samples/sec: 552.89 - lr: 0.000151 - momentum: 0.000000
2023-10-11 12:20:21,743 epoch 2 - iter 162/272 - loss 0.53477008 - time (sec): 55.63 - samples/sec: 558.84 - lr: 0.000149 - momentum: 0.000000
2023-10-11 12:20:30,560 epoch 2 - iter 189/272 - loss 0.51131648 - time (sec): 64.45 - samples/sec: 549.06 - lr: 0.000148 - momentum: 0.000000
2023-10-11 12:20:39,933 epoch 2 - iter 216/272 - loss 0.48449409 - time (sec): 73.83 - samples/sec: 547.01 - lr: 0.000146 - momentum: 0.000000
2023-10-11 12:20:49,031 epoch 2 - iter 243/272 - loss 0.47029300 - time (sec): 82.92 - samples/sec: 544.63 - lr: 0.000144 - momentum: 0.000000
2023-10-11 12:20:59,571 epoch 2 - iter 270/272 - loss 0.45240623 - time (sec): 93.46 - samples/sec: 554.23 - lr: 0.000142 - momentum: 0.000000
2023-10-11 12:20:59,996 ----------------------------------------------------------------------------------------------------
2023-10-11 12:20:59,996 EPOCH 2 done: loss 0.4521 - lr: 0.000142
2023-10-11 12:21:05,525 DEV : loss 0.27614670991897583 - f1-score (micro avg) 0.3235
2023-10-11 12:21:05,534 saving best model
2023-10-11 12:21:06,411 ----------------------------------------------------------------------------------------------------
2023-10-11 12:21:15,715 epoch 3 - iter 27/272 - loss 0.23309089 - time (sec): 9.30 - samples/sec: 548.35 - lr: 0.000141 - momentum: 0.000000
2023-10-11 12:21:25,472 epoch 3 - iter 54/272 - loss 0.27485890 - time (sec): 19.06 - samples/sec: 567.71 - lr: 0.000139 - momentum: 0.000000
2023-10-11 12:21:34,737 epoch 3 - iter 81/272 - loss 0.27012296 - time (sec): 28.32 - samples/sec: 569.76 - lr: 0.000137 - momentum: 0.000000
2023-10-11 12:21:43,893 epoch 3 - iter 108/272 - loss 0.26806557 - time (sec): 37.48 - samples/sec: 560.35 - lr: 0.000135 - momentum: 0.000000
2023-10-11 12:21:53,166 epoch 3 - iter 135/272 - loss 0.26644863 - time (sec): 46.75 - samples/sec: 560.30 - lr: 0.000133 - momentum: 0.000000
2023-10-11 12:22:02,417 epoch 3 - iter 162/272 - loss 0.27294415 - time (sec): 56.00 - samples/sec: 559.74 - lr: 0.000132 - momentum: 0.000000
2023-10-11 12:22:12,036 epoch 3 - iter 189/272 - loss 0.26862119 - time (sec): 65.62 - samples/sec: 563.37 - lr: 0.000130 - momentum: 0.000000
2023-10-11 12:22:21,344 epoch 3 - iter 216/272 - loss 0.26282089 - time (sec): 74.93 - samples/sec: 559.23 - lr: 0.000128 - momentum: 0.000000
2023-10-11 12:22:30,546 epoch 3 - iter 243/272 - loss 0.26662864 - time (sec): 84.13 - samples/sec: 556.68 - lr: 0.000126 - momentum: 0.000000
2023-10-11 12:22:39,525 epoch 3 - iter 270/272 - loss 0.26058992 - time (sec): 93.11 - samples/sec: 554.85 - lr: 0.000125 - momentum: 0.000000
2023-10-11 12:22:40,069 ----------------------------------------------------------------------------------------------------
2023-10-11 12:22:40,069 EPOCH 3 done: loss 0.2592 - lr: 0.000125
2023-10-11 12:22:45,596 DEV : loss 0.19185124337673187 - f1-score (micro avg) 0.5766
2023-10-11 12:22:45,604 saving best model
2023-10-11 12:22:48,091 ----------------------------------------------------------------------------------------------------
2023-10-11 12:22:57,047 epoch 4 - iter 27/272 - loss 0.17559727 - time (sec): 8.95 - samples/sec: 544.28 - lr: 0.000123 - momentum: 0.000000
2023-10-11 12:23:06,445 epoch 4 - iter 54/272 - loss 0.14763057 - time (sec): 18.35 - samples/sec: 564.09 - lr: 0.000121 - momentum: 0.000000
2023-10-11 12:23:16,156 epoch 4 - iter 81/272 - loss 0.16759112 - time (sec): 28.06 - samples/sec: 578.45 - lr: 0.000119 - momentum: 0.000000
2023-10-11 12:23:25,501 epoch 4 - iter 108/272 - loss 0.16355731 - time (sec): 37.41 - samples/sec: 574.32 - lr: 0.000117 - momentum: 0.000000
2023-10-11 12:23:35,043 epoch 4 - iter 135/272 - loss 0.15538460 - time (sec): 46.95 - samples/sec: 575.87 - lr: 0.000116 - momentum: 0.000000
2023-10-11 12:23:43,666 epoch 4 - iter 162/272 - loss 0.15731211 - time (sec): 55.57 - samples/sec: 567.27 - lr: 0.000114 - momentum: 0.000000
2023-10-11 12:23:53,322 epoch 4 - iter 189/272 - loss 0.15895911 - time (sec): 65.23 - samples/sec: 569.26 - lr: 0.000112 - momentum: 0.000000
2023-10-11 12:24:02,161 epoch 4 - iter 216/272 - loss 0.15623754 - time (sec): 74.07 - samples/sec: 564.59 - lr: 0.000110 - momentum: 0.000000
2023-10-11 12:24:10,962 epoch 4 - iter 243/272 - loss 0.15352344 - time (sec): 82.87 - samples/sec: 560.85 - lr: 0.000109 - momentum: 0.000000
2023-10-11 12:24:20,498 epoch 4 - iter 270/272 - loss 0.15512637 - time (sec): 92.40 - samples/sec: 560.35 - lr: 0.000107 - momentum: 0.000000
2023-10-11 12:24:20,943 ----------------------------------------------------------------------------------------------------
2023-10-11 12:24:20,943 EPOCH 4 done: loss 0.1547 - lr: 0.000107
2023-10-11 12:24:26,384 DEV : loss 0.15080419182777405 - f1-score (micro avg) 0.6617
2023-10-11 12:24:26,392 saving best model
2023-10-11 12:24:28,921 ----------------------------------------------------------------------------------------------------
2023-10-11 12:24:38,560 epoch 5 - iter 27/272 - loss 0.11298860 - time (sec): 9.63 - samples/sec: 581.63 - lr: 0.000105 - momentum: 0.000000
2023-10-11 12:24:47,611 epoch 5 - iter 54/272 - loss 0.11828460 - time (sec): 18.69 - samples/sec: 562.67 - lr: 0.000103 - momentum: 0.000000
2023-10-11 12:24:57,149 epoch 5 - iter 81/272 - loss 0.12324494 - time (sec): 28.22 - samples/sec: 576.89 - lr: 0.000101 - momentum: 0.000000
2023-10-11 12:25:06,969 epoch 5 - iter 108/272 - loss 0.11225919 - time (sec): 38.04 - samples/sec: 581.49 - lr: 0.000100 - momentum: 0.000000
2023-10-11 12:25:15,898 epoch 5 - iter 135/272 - loss 0.10889944 - time (sec): 46.97 - samples/sec: 577.04 - lr: 0.000098 - momentum: 0.000000
2023-10-11 12:25:24,717 epoch 5 - iter 162/272 - loss 0.10759412 - time (sec): 55.79 - samples/sec: 569.39 - lr: 0.000096 - momentum: 0.000000
2023-10-11 12:25:33,490 epoch 5 - iter 189/272 - loss 0.10450328 - time (sec): 64.56 - samples/sec: 563.78 - lr: 0.000094 - momentum: 0.000000
2023-10-11 12:25:42,652 epoch 5 - iter 216/272 - loss 0.10251942 - time (sec): 73.73 - samples/sec: 564.22 - lr: 0.000093 - momentum: 0.000000
2023-10-11 12:25:51,924 epoch 5 - iter 243/272 - loss 0.10536311 - time (sec): 83.00 - samples/sec: 564.83 - lr: 0.000091 - momentum: 0.000000
2023-10-11 12:26:00,818 epoch 5 - iter 270/272 - loss 0.10201182 - time (sec): 91.89 - samples/sec: 561.89 - lr: 0.000089 - momentum: 0.000000
2023-10-11 12:26:01,388 ----------------------------------------------------------------------------------------------------
2023-10-11 12:26:01,388 EPOCH 5 done: loss 0.1023 - lr: 0.000089
2023-10-11 12:26:06,848 DEV : loss 0.1377904713153839 - f1-score (micro avg) 0.6462
2023-10-11 12:26:06,856 ----------------------------------------------------------------------------------------------------
2023-10-11 12:26:16,012 epoch 6 - iter 27/272 - loss 0.06858667 - time (sec): 9.15 - samples/sec: 560.28 - lr: 0.000087 - momentum: 0.000000
2023-10-11 12:26:25,347 epoch 6 - iter 54/272 - loss 0.07270870 - time (sec): 18.49 - samples/sec: 552.38 - lr: 0.000085 - momentum: 0.000000
2023-10-11 12:26:35,819 epoch 6 - iter 81/272 - loss 0.07138524 - time (sec): 28.96 - samples/sec: 571.95 - lr: 0.000084 - momentum: 0.000000
2023-10-11 12:26:44,751 epoch 6 - iter 108/272 - loss 0.07082171 - time (sec): 37.89 - samples/sec: 558.81 - lr: 0.000082 - momentum: 0.000000
2023-10-11 12:26:53,646 epoch 6 - iter 135/272 - loss 0.06942375 - time (sec): 46.79 - samples/sec: 554.10 - lr: 0.000080 - momentum: 0.000000
2023-10-11 12:27:02,878 epoch 6 - iter 162/272 - loss 0.06931374 - time (sec): 56.02 - samples/sec: 555.19 - lr: 0.000078 - momentum: 0.000000
2023-10-11 12:27:11,743 epoch 6 - iter 189/272 - loss 0.07471804 - time (sec): 64.89 - samples/sec: 551.02 - lr: 0.000077 - momentum: 0.000000
2023-10-11 12:27:21,147 epoch 6 - iter 216/272 - loss 0.07358616 - time (sec): 74.29 - samples/sec: 551.36 - lr: 0.000075 - momentum: 0.000000
2023-10-11 12:27:31,007 epoch 6 - iter 243/272 - loss 0.07385375 - time (sec): 84.15 - samples/sec: 554.53 - lr: 0.000073 - momentum: 0.000000
2023-10-11 12:27:40,226 epoch 6 - iter 270/272 - loss 0.07393521 - time (sec): 93.37 - samples/sec: 554.54 - lr: 0.000071 - momentum: 0.000000
2023-10-11 12:27:40,629 ----------------------------------------------------------------------------------------------------
2023-10-11 12:27:40,630 EPOCH 6 done: loss 0.0737 - lr: 0.000071
2023-10-11 12:27:46,135 DEV : loss 0.13831757009029388 - f1-score (micro avg) 0.7681
2023-10-11 12:27:46,142 saving best model
2023-10-11 12:27:48,649 ----------------------------------------------------------------------------------------------------
2023-10-11 12:27:57,510 epoch 7 - iter 27/272 - loss 0.05129014 - time (sec): 8.86 - samples/sec: 549.17 - lr: 0.000069 - momentum: 0.000000
2023-10-11 12:28:06,049 epoch 7 - iter 54/272 - loss 0.06171732 - time (sec): 17.40 - samples/sec: 535.80 - lr: 0.000068 - momentum: 0.000000
2023-10-11 12:28:14,620 epoch 7 - iter 81/272 - loss 0.05770185 - time (sec): 25.97 - samples/sec: 533.83 - lr: 0.000066 - momentum: 0.000000
2023-10-11 12:28:23,980 epoch 7 - iter 108/272 - loss 0.05578072 - time (sec): 35.33 - samples/sec: 540.64 - lr: 0.000064 - momentum: 0.000000
2023-10-11 12:28:33,686 epoch 7 - iter 135/272 - loss 0.05183090 - time (sec): 45.03 - samples/sec: 548.95 - lr: 0.000062 - momentum: 0.000000
2023-10-11 12:28:43,514 epoch 7 - iter 162/272 - loss 0.05034293 - time (sec): 54.86 - samples/sec: 556.20 - lr: 0.000061 - momentum: 0.000000
2023-10-11 12:28:52,842 epoch 7 - iter 189/272 - loss 0.05522158 - time (sec): 64.19 - samples/sec: 553.57 - lr: 0.000059 - momentum: 0.000000
2023-10-11 12:29:01,292 epoch 7 - iter 216/272 - loss 0.05317533 - time (sec): 72.64 - samples/sec: 547.22 - lr: 0.000057 - momentum: 0.000000
2023-10-11 12:29:11,075 epoch 7 - iter 243/272 - loss 0.05369366 - time (sec): 82.42 - samples/sec: 554.67 - lr: 0.000055 - momentum: 0.000000
2023-10-11 12:29:21,164 epoch 7 - iter 270/272 - loss 0.05345304 - time (sec): 92.51 - samples/sec: 560.18 - lr: 0.000054 - momentum: 0.000000
2023-10-11 12:29:21,547 ----------------------------------------------------------------------------------------------------
2023-10-11 12:29:21,548 EPOCH 7 done: loss 0.0535 - lr: 0.000054
2023-10-11 12:29:27,021 DEV : loss 0.14701178669929504 - f1-score (micro avg) 0.7956
2023-10-11 12:29:27,029 saving best model
2023-10-11 12:29:29,508 ----------------------------------------------------------------------------------------------------
2023-10-11 12:29:38,586 epoch 8 - iter 27/272 - loss 0.02662217 - time (sec): 9.07 - samples/sec: 567.16 - lr: 0.000052 - momentum: 0.000000
2023-10-11 12:29:48,267 epoch 8 - iter 54/272 - loss 0.04238572 - time (sec): 18.76 - samples/sec: 584.96 - lr: 0.000050 - momentum: 0.000000
2023-10-11 12:29:57,482 epoch 8 - iter 81/272 - loss 0.04066595 - time (sec): 27.97 - samples/sec: 578.41 - lr: 0.000048 - momentum: 0.000000
2023-10-11 12:30:06,552 epoch 8 - iter 108/272 - loss 0.03956566 - time (sec): 37.04 - samples/sec: 565.28 - lr: 0.000046 - momentum: 0.000000
2023-10-11 12:30:16,114 epoch 8 - iter 135/272 - loss 0.03767651 - time (sec): 46.60 - samples/sec: 567.35 - lr: 0.000045 - momentum: 0.000000
2023-10-11 12:30:26,166 epoch 8 - iter 162/272 - loss 0.04051798 - time (sec): 56.65 - samples/sec: 577.03 - lr: 0.000043 - momentum: 0.000000
2023-10-11 12:30:34,685 epoch 8 - iter 189/272 - loss 0.04409632 - time (sec): 65.17 - samples/sec: 565.84 - lr: 0.000041 - momentum: 0.000000
2023-10-11 12:30:44,286 epoch 8 - iter 216/272 - loss 0.04433015 - time (sec): 74.77 - samples/sec: 566.86 - lr: 0.000039 - momentum: 0.000000
2023-10-11 12:30:53,159 epoch 8 - iter 243/272 - loss 0.04347717 - time (sec): 83.65 - samples/sec: 563.86 - lr: 0.000038 - momentum: 0.000000
2023-10-11 12:31:02,010 epoch 8 - iter 270/272 - loss 0.04284592 - time (sec): 92.50 - samples/sec: 559.27 - lr: 0.000036 - momentum: 0.000000
2023-10-11 12:31:02,455 ----------------------------------------------------------------------------------------------------
2023-10-11 12:31:02,455 EPOCH 8 done: loss 0.0426 - lr: 0.000036
2023-10-11 12:31:07,906 DEV : loss 0.14679642021656036 - f1-score (micro avg) 0.8015
2023-10-11 12:31:07,914 saving best model
2023-10-11 12:31:10,409 ----------------------------------------------------------------------------------------------------
2023-10-11 12:31:19,316 epoch 9 - iter 27/272 - loss 0.03044480 - time (sec): 8.90 - samples/sec: 549.94 - lr: 0.000034 - momentum: 0.000000
2023-10-11 12:31:27,617 epoch 9 - iter 54/272 - loss 0.04315447 - time (sec): 17.20 - samples/sec: 538.76 - lr: 0.000032 - momentum: 0.000000
2023-10-11 12:31:36,320 epoch 9 - iter 81/272 - loss 0.03877733 - time (sec): 25.91 - samples/sec: 540.85 - lr: 0.000030 - momentum: 0.000000
2023-10-11 12:31:45,762 epoch 9 - iter 108/272 - loss 0.03541826 - time (sec): 35.35 - samples/sec: 553.43 - lr: 0.000029 - momentum: 0.000000
2023-10-11 12:31:55,099 epoch 9 - iter 135/272 - loss 0.03420976 - time (sec): 44.69 - samples/sec: 559.43 - lr: 0.000027 - momentum: 0.000000
2023-10-11 12:32:04,049 epoch 9 - iter 162/272 - loss 0.03216315 - time (sec): 53.64 - samples/sec: 558.43 - lr: 0.000025 - momentum: 0.000000
2023-10-11 12:32:13,330 epoch 9 - iter 189/272 - loss 0.03081178 - time (sec): 62.92 - samples/sec: 558.94 - lr: 0.000023 - momentum: 0.000000
2023-10-11 12:32:23,043 epoch 9 - iter 216/272 - loss 0.03134856 - time (sec): 72.63 - samples/sec: 564.70 - lr: 0.000022 - momentum: 0.000000
2023-10-11 12:32:32,028 epoch 9 - iter 243/272 - loss 0.03443496 - time (sec): 81.61 - samples/sec: 562.97 - lr: 0.000020 - momentum: 0.000000
2023-10-11 12:32:41,604 epoch 9 - iter 270/272 - loss 0.03386938 - time (sec): 91.19 - samples/sec: 567.00 - lr: 0.000018 - momentum: 0.000000
2023-10-11 12:32:42,083 ----------------------------------------------------------------------------------------------------
2023-10-11 12:32:42,083 EPOCH 9 done: loss 0.0337 - lr: 0.000018
2023-10-11 12:32:47,749 DEV : loss 0.149958074092865 - f1-score (micro avg) 0.8096
2023-10-11 12:32:47,757 saving best model
2023-10-11 12:32:50,259 ----------------------------------------------------------------------------------------------------
2023-10-11 12:32:59,526 epoch 10 - iter 27/272 - loss 0.03830851 - time (sec): 9.26 - samples/sec: 576.23 - lr: 0.000016 - momentum: 0.000000
2023-10-11 12:33:08,503 epoch 10 - iter 54/272 - loss 0.03546184 - time (sec): 18.24 - samples/sec: 571.99 - lr: 0.000014 - momentum: 0.000000
2023-10-11 12:33:17,103 epoch 10 - iter 81/272 - loss 0.03145260 - time (sec): 26.84 - samples/sec: 562.54 - lr: 0.000013 - momentum: 0.000000
2023-10-11 12:33:26,206 epoch 10 - iter 108/272 - loss 0.03259123 - time (sec): 35.94 - samples/sec: 557.04 - lr: 0.000011 - momentum: 0.000000
2023-10-11 12:33:36,463 epoch 10 - iter 135/272 - loss 0.02974927 - time (sec): 46.20 - samples/sec: 570.11 - lr: 0.000009 - momentum: 0.000000
2023-10-11 12:33:45,171 epoch 10 - iter 162/272 - loss 0.02889802 - time (sec): 54.91 - samples/sec: 558.86 - lr: 0.000007 - momentum: 0.000000
2023-10-11 12:33:54,983 epoch 10 - iter 189/272 - loss 0.02806859 - time (sec): 64.72 - samples/sec: 559.46 - lr: 0.000005 - momentum: 0.000000
2023-10-11 12:34:04,346 epoch 10 - iter 216/272 - loss 0.02816137 - time (sec): 74.08 - samples/sec: 557.21 - lr: 0.000004 - momentum: 0.000000
2023-10-11 12:34:13,684 epoch 10 - iter 243/272 - loss 0.02889036 - time (sec): 83.42 - samples/sec: 558.00 - lr: 0.000002 - momentum: 0.000000
2023-10-11 12:34:23,246 epoch 10 - iter 270/272 - loss 0.02974897 - time (sec): 92.98 - samples/sec: 556.61 - lr: 0.000000 - momentum: 0.000000
2023-10-11 12:34:23,713 ----------------------------------------------------------------------------------------------------
2023-10-11 12:34:23,713 EPOCH 10 done: loss 0.0297 - lr: 0.000000
2023-10-11 12:34:29,409 DEV : loss 0.15252262353897095 - f1-score (micro avg) 0.8051
2023-10-11 12:34:30,254 ----------------------------------------------------------------------------------------------------
2023-10-11 12:34:30,256 Loading model from best epoch ...
2023-10-11 12:34:34,051 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
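Once training has finished, the saved checkpoint can be used for inference. A minimal usage sketch (the path is the logged base path plus best-model.pt; the Swedish example sentence is invented):

```python
# Sketch only: load the best checkpoint and tag a made-up sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Anders Zorn föddes i Mora i Dalarna.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 4))
```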
2023-10-11 12:34:46,395
Results:
- F-score (micro) 0.7562
- F-score (macro) 0.6778
- Accuracy 0.6276

By class:
              precision    recall  f1-score   support

         LOC     0.7437    0.8558    0.7958       312
         PER     0.7287    0.8654    0.7912       208
         ORG     0.3793    0.4000    0.3894        55
   HumanProd     0.6667    0.8182    0.7347        22

   micro avg     0.7048    0.8157    0.7562       597
   macro avg     0.6296    0.7348    0.6778       597
weighted avg     0.7021    0.8157    0.7545       597

2023-10-11 12:34:46,395 ----------------------------------------------------------------------------------------------------
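As a quick arithmetic check, the reported micro-average F1 follows directly from the micro precision and recall in the table above:

```python
# F1 = 2PR / (P + R), using the logged micro-average precision and recall.
p, r = 0.7048, 0.8157
f1 = 2 * p * r / (p + r)
print(round(f1, 4))  # 0.7562, matching "F-score (micro) 0.7562"
```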