stefan-it committed on
Commit
515f8ce
1 Parent(s): c7742b7

Upload ./training.log with huggingface_hub

  1. training.log +258 -0
training.log ADDED
@@ -0,0 +1,258 @@
+ 2023-10-19 00:16:41,198 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,199 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(31103, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=81, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-19 00:16:41,199 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,199 Corpus: 6900 train + 1576 dev + 1833 test sentences
+ 2023-10-19 00:16:41,199 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,199 Train: 6900 sentences
+ 2023-10-19 00:16:41,200 (train_with_dev=False, train_with_test=False)
+ 2023-10-19 00:16:41,200 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,200 Training Params:
+ 2023-10-19 00:16:41,200 - learning_rate: "3e-05"
+ 2023-10-19 00:16:41,200 - mini_batch_size: "16"
+ 2023-10-19 00:16:41,200 - max_epochs: "10"
+ 2023-10-19 00:16:41,200 - shuffle: "True"
+ 2023-10-19 00:16:41,200 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,200 Plugins:
+ 2023-10-19 00:16:41,200 - TensorboardLogger
+ 2023-10-19 00:16:41,200 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-19 00:16:41,200 ----------------------------------------------------------------------------------------------------
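The parameters above (6900 training sentences, mini_batch_size 16, max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1) determine the `iter N/432` counter and the `lr:` column in the epoch lines that follow. The sketch below reproduces those numbers under the assumption that the scheduler ramps linearly from zero to the peak rate during the warmup steps and then decays linearly to zero. `linear_schedule_lr` is a hypothetical helper written for this illustration, not Flair's implementation.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate at a global step for linear warmup followed by linear decay.

    Assumed schedule shape; the name and signature are illustrative only.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: scale linearly from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


# Values taken from the log header.
iters_per_epoch = math.ceil(6900 / 16)   # mini-batches per epoch
total_steps = iters_per_epoch * 10       # max_epochs = 10

# Global steps matching "epoch 1 - iter 430" and "epoch 2 - iter 430".
for step in (430, 432 + 430):
    lr = linear_schedule_lr(step, total_steps, 3e-5, 0.1)
    print(f"step {step}: lr {lr:.6f}")
```

With a warmup fraction of 0.1 over 10 epochs, the warmup covers exactly the first epoch, which matches the logged rate rising to 0.000030 by the end of epoch 1 and falling afterwards.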
+ 2023-10-19 00:16:41,200 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-19 00:16:41,200 - metric: "('micro avg', 'f1-score')"
+ 2023-10-19 00:16:41,200 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,200 Computation:
+ 2023-10-19 00:16:41,200 - compute on device: cuda:0
+ 2023-10-19 00:16:41,200 - embedding storage: none
+ 2023-10-19 00:16:41,200 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,200 Model training base path: "autotrain-flair-mobie-gbert_base-bs16-e10-lr3e-05-2"
+ 2023-10-19 00:16:41,201 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,201 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:16:41,201 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-19 00:16:56,550 epoch 1 - iter 43/432 - loss 4.74339679 - time (sec): 15.35 - samples/sec: 393.65 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:17:10,707 epoch 1 - iter 86/432 - loss 3.69280224 - time (sec): 29.51 - samples/sec: 416.26 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:17:25,377 epoch 1 - iter 129/432 - loss 3.11907087 - time (sec): 44.18 - samples/sec: 412.84 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:17:39,369 epoch 1 - iter 172/432 - loss 2.79415985 - time (sec): 58.17 - samples/sec: 418.02 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:17:54,435 epoch 1 - iter 215/432 - loss 2.57050206 - time (sec): 73.23 - samples/sec: 411.49 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:18:09,813 epoch 1 - iter 258/432 - loss 2.33799933 - time (sec): 88.61 - samples/sec: 413.95 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:18:24,648 epoch 1 - iter 301/432 - loss 2.16172499 - time (sec): 103.45 - samples/sec: 414.53 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:18:39,760 epoch 1 - iter 344/432 - loss 2.01712724 - time (sec): 118.56 - samples/sec: 413.49 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:18:54,151 epoch 1 - iter 387/432 - loss 1.88891723 - time (sec): 132.95 - samples/sec: 413.69 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:19:09,750 epoch 1 - iter 430/432 - loss 1.76466676 - time (sec): 148.55 - samples/sec: 414.43 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 00:19:10,345 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:19:10,346 EPOCH 1 done: loss 1.7605 - lr: 0.000030
+ 2023-10-19 00:19:23,881 DEV : loss 0.5642062425613403 - f1-score (micro avg) 0.6351
+ 2023-10-19 00:19:23,905 saving best model
+ 2023-10-19 00:19:24,363 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:19:38,473 epoch 2 - iter 43/432 - loss 0.63277954 - time (sec): 14.11 - samples/sec: 460.79 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 00:19:52,843 epoch 2 - iter 86/432 - loss 0.58715243 - time (sec): 28.48 - samples/sec: 447.57 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 00:20:08,079 epoch 2 - iter 129/432 - loss 0.57597451 - time (sec): 43.71 - samples/sec: 428.05 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 00:20:23,330 epoch 2 - iter 172/432 - loss 0.55894939 - time (sec): 58.97 - samples/sec: 424.51 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 00:20:38,457 epoch 2 - iter 215/432 - loss 0.54682286 - time (sec): 74.09 - samples/sec: 418.14 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 00:20:53,042 epoch 2 - iter 258/432 - loss 0.53151093 - time (sec): 88.68 - samples/sec: 418.66 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 00:21:07,908 epoch 2 - iter 301/432 - loss 0.51887975 - time (sec): 103.54 - samples/sec: 414.79 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 00:21:22,719 epoch 2 - iter 344/432 - loss 0.51217919 - time (sec): 118.35 - samples/sec: 414.72 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:21:37,756 epoch 2 - iter 387/432 - loss 0.49869397 - time (sec): 133.39 - samples/sec: 414.52 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:21:52,045 epoch 2 - iter 430/432 - loss 0.49068478 - time (sec): 147.68 - samples/sec: 417.04 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:21:52,617 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:21:52,617 EPOCH 2 done: loss 0.4907 - lr: 0.000027
+ 2023-10-19 00:22:05,834 DEV : loss 0.346346378326416 - f1-score (micro avg) 0.7698
+ 2023-10-19 00:22:05,858 saving best model
+ 2023-10-19 00:22:07,095 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:22:21,551 epoch 3 - iter 43/432 - loss 0.34282667 - time (sec): 14.45 - samples/sec: 406.92 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 00:22:36,270 epoch 3 - iter 86/432 - loss 0.33200713 - time (sec): 29.17 - samples/sec: 403.75 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 00:22:52,022 epoch 3 - iter 129/432 - loss 0.32080282 - time (sec): 44.93 - samples/sec: 396.68 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 00:23:07,641 epoch 3 - iter 172/432 - loss 0.31771938 - time (sec): 60.54 - samples/sec: 393.81 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 00:23:22,177 epoch 3 - iter 215/432 - loss 0.31302256 - time (sec): 75.08 - samples/sec: 400.77 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 00:23:36,579 epoch 3 - iter 258/432 - loss 0.31007470 - time (sec): 89.48 - samples/sec: 407.35 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 00:23:50,695 epoch 3 - iter 301/432 - loss 0.30555352 - time (sec): 103.60 - samples/sec: 412.23 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:24:06,473 epoch 3 - iter 344/432 - loss 0.29928908 - time (sec): 119.38 - samples/sec: 411.36 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:24:20,745 epoch 3 - iter 387/432 - loss 0.29428176 - time (sec): 133.65 - samples/sec: 412.12 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:24:35,460 epoch 3 - iter 430/432 - loss 0.29672467 - time (sec): 148.36 - samples/sec: 415.78 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 00:24:35,971 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:24:35,971 EPOCH 3 done: loss 0.2967 - lr: 0.000023
+ 2023-10-19 00:24:49,187 DEV : loss 0.31504037976264954 - f1-score (micro avg) 0.7963
+ 2023-10-19 00:24:49,211 saving best model
+ 2023-10-19 00:24:50,467 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:25:05,412 epoch 4 - iter 43/432 - loss 0.20335532 - time (sec): 14.94 - samples/sec: 420.97 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 00:25:20,061 epoch 4 - iter 86/432 - loss 0.22350075 - time (sec): 29.59 - samples/sec: 420.23 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 00:25:35,172 epoch 4 - iter 129/432 - loss 0.22388534 - time (sec): 44.70 - samples/sec: 421.31 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 00:25:49,750 epoch 4 - iter 172/432 - loss 0.22617688 - time (sec): 59.28 - samples/sec: 417.61 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 00:26:05,215 epoch 4 - iter 215/432 - loss 0.21971434 - time (sec): 74.75 - samples/sec: 414.52 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 00:26:20,389 epoch 4 - iter 258/432 - loss 0.22117839 - time (sec): 89.92 - samples/sec: 414.89 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:26:35,115 epoch 4 - iter 301/432 - loss 0.22265378 - time (sec): 104.65 - samples/sec: 416.69 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:26:49,810 epoch 4 - iter 344/432 - loss 0.22151182 - time (sec): 119.34 - samples/sec: 416.37 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:27:03,535 epoch 4 - iter 387/432 - loss 0.21789655 - time (sec): 133.07 - samples/sec: 419.09 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 00:27:18,579 epoch 4 - iter 430/432 - loss 0.21456196 - time (sec): 148.11 - samples/sec: 416.70 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 00:27:19,241 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:27:19,242 EPOCH 4 done: loss 0.2144 - lr: 0.000020
+ 2023-10-19 00:27:32,562 DEV : loss 0.3091279864311218 - f1-score (micro avg) 0.8213
+ 2023-10-19 00:27:32,590 saving best model
+ 2023-10-19 00:27:34,992 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:27:49,017 epoch 5 - iter 43/432 - loss 0.17866192 - time (sec): 14.02 - samples/sec: 435.90 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 00:28:03,936 epoch 5 - iter 86/432 - loss 0.17512337 - time (sec): 28.94 - samples/sec: 420.14 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 00:28:19,866 epoch 5 - iter 129/432 - loss 0.16824137 - time (sec): 44.87 - samples/sec: 407.60 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 00:28:34,676 epoch 5 - iter 172/432 - loss 0.16860082 - time (sec): 59.68 - samples/sec: 411.83 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 00:28:49,495 epoch 5 - iter 215/432 - loss 0.16529930 - time (sec): 74.50 - samples/sec: 410.46 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:29:03,952 epoch 5 - iter 258/432 - loss 0.16770005 - time (sec): 88.96 - samples/sec: 417.59 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:29:18,790 epoch 5 - iter 301/432 - loss 0.16640170 - time (sec): 103.80 - samples/sec: 416.69 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:29:33,385 epoch 5 - iter 344/432 - loss 0.16427645 - time (sec): 118.39 - samples/sec: 418.21 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 00:29:48,488 epoch 5 - iter 387/432 - loss 0.16540814 - time (sec): 133.50 - samples/sec: 416.79 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 00:30:02,971 epoch 5 - iter 430/432 - loss 0.16431132 - time (sec): 147.98 - samples/sec: 416.45 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 00:30:03,520 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:30:03,521 EPOCH 5 done: loss 0.1640 - lr: 0.000017
+ 2023-10-19 00:30:16,827 DEV : loss 0.3160895109176636 - f1-score (micro avg) 0.8377
+ 2023-10-19 00:30:16,851 saving best model
+ 2023-10-19 00:30:18,089 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:30:33,807 epoch 6 - iter 43/432 - loss 0.11962951 - time (sec): 15.72 - samples/sec: 392.96 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 00:30:48,631 epoch 6 - iter 86/432 - loss 0.11884539 - time (sec): 30.54 - samples/sec: 397.61 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 00:31:03,258 epoch 6 - iter 129/432 - loss 0.12228484 - time (sec): 45.17 - samples/sec: 400.22 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 00:31:17,032 epoch 6 - iter 172/432 - loss 0.12474293 - time (sec): 58.94 - samples/sec: 414.07 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:31:31,618 epoch 6 - iter 215/432 - loss 0.12867530 - time (sec): 73.53 - samples/sec: 415.73 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:31:46,035 epoch 6 - iter 258/432 - loss 0.12946820 - time (sec): 87.94 - samples/sec: 419.70 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:32:01,941 epoch 6 - iter 301/432 - loss 0.13096586 - time (sec): 103.85 - samples/sec: 416.74 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 00:32:16,804 epoch 6 - iter 344/432 - loss 0.12989295 - time (sec): 118.71 - samples/sec: 415.91 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 00:32:31,446 epoch 6 - iter 387/432 - loss 0.12982969 - time (sec): 133.35 - samples/sec: 414.74 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 00:32:46,889 epoch 6 - iter 430/432 - loss 0.12896680 - time (sec): 148.80 - samples/sec: 414.57 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 00:32:47,622 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:32:47,623 EPOCH 6 done: loss 0.1289 - lr: 0.000013
+ 2023-10-19 00:33:00,960 DEV : loss 0.33405086398124695 - f1-score (micro avg) 0.8325
+ 2023-10-19 00:33:00,984 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:33:14,708 epoch 7 - iter 43/432 - loss 0.09581046 - time (sec): 13.72 - samples/sec: 465.09 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 00:33:29,254 epoch 7 - iter 86/432 - loss 0.09800936 - time (sec): 28.27 - samples/sec: 441.95 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 00:33:44,617 epoch 7 - iter 129/432 - loss 0.09725239 - time (sec): 43.63 - samples/sec: 419.20 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:34:00,001 epoch 7 - iter 172/432 - loss 0.09688733 - time (sec): 59.02 - samples/sec: 416.05 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:34:14,468 epoch 7 - iter 215/432 - loss 0.10137512 - time (sec): 73.48 - samples/sec: 420.10 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:34:29,569 epoch 7 - iter 258/432 - loss 0.10565885 - time (sec): 88.58 - samples/sec: 418.11 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 00:34:44,754 epoch 7 - iter 301/432 - loss 0.10530649 - time (sec): 103.77 - samples/sec: 416.15 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 00:34:59,759 epoch 7 - iter 344/432 - loss 0.10536216 - time (sec): 118.77 - samples/sec: 414.76 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 00:35:15,545 epoch 7 - iter 387/432 - loss 0.10554671 - time (sec): 134.56 - samples/sec: 412.00 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 00:35:31,052 epoch 7 - iter 430/432 - loss 0.10598819 - time (sec): 150.07 - samples/sec: 411.01 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 00:35:31,863 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:35:31,863 EPOCH 7 done: loss 0.1058 - lr: 0.000010
+ 2023-10-19 00:35:45,283 DEV : loss 0.351141095161438 - f1-score (micro avg) 0.831
+ 2023-10-19 00:35:45,307 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:35:59,848 epoch 8 - iter 43/432 - loss 0.07709807 - time (sec): 14.54 - samples/sec: 410.88 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 00:36:14,941 epoch 8 - iter 86/432 - loss 0.08407471 - time (sec): 29.63 - samples/sec: 424.33 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:36:29,079 epoch 8 - iter 129/432 - loss 0.07999086 - time (sec): 43.77 - samples/sec: 416.05 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:36:42,973 epoch 8 - iter 172/432 - loss 0.08291254 - time (sec): 57.66 - samples/sec: 434.37 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:36:57,433 epoch 8 - iter 215/432 - loss 0.08343817 - time (sec): 72.12 - samples/sec: 436.23 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 00:37:12,605 epoch 8 - iter 258/432 - loss 0.08221956 - time (sec): 87.30 - samples/sec: 429.21 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 00:37:27,858 epoch 8 - iter 301/432 - loss 0.08082516 - time (sec): 102.55 - samples/sec: 426.54 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 00:37:42,508 epoch 8 - iter 344/432 - loss 0.08300870 - time (sec): 117.20 - samples/sec: 427.00 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 00:37:57,378 epoch 8 - iter 387/432 - loss 0.08417221 - time (sec): 132.07 - samples/sec: 422.73 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 00:38:12,945 epoch 8 - iter 430/432 - loss 0.08447959 - time (sec): 147.64 - samples/sec: 417.49 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 00:38:13,806 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:38:13,806 EPOCH 8 done: loss 0.0846 - lr: 0.000007
+ 2023-10-19 00:38:27,128 DEV : loss 0.35482051968574524 - f1-score (micro avg) 0.8445
+ 2023-10-19 00:38:27,153 saving best model
+ 2023-10-19 00:38:28,401 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:38:42,957 epoch 9 - iter 43/432 - loss 0.07970238 - time (sec): 14.55 - samples/sec: 410.87 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:38:57,617 epoch 9 - iter 86/432 - loss 0.06637511 - time (sec): 29.21 - samples/sec: 426.50 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:39:13,050 epoch 9 - iter 129/432 - loss 0.06582245 - time (sec): 44.65 - samples/sec: 415.68 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:39:29,031 epoch 9 - iter 172/432 - loss 0.06680679 - time (sec): 60.63 - samples/sec: 400.85 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 00:39:43,689 epoch 9 - iter 215/432 - loss 0.06514411 - time (sec): 75.29 - samples/sec: 410.03 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 00:39:58,720 epoch 9 - iter 258/432 - loss 0.06484547 - time (sec): 90.32 - samples/sec: 409.12 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 00:40:13,802 epoch 9 - iter 301/432 - loss 0.06626985 - time (sec): 105.40 - samples/sec: 411.40 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 00:40:28,801 epoch 9 - iter 344/432 - loss 0.06745724 - time (sec): 120.40 - samples/sec: 409.92 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 00:40:43,958 epoch 9 - iter 387/432 - loss 0.06792850 - time (sec): 135.56 - samples/sec: 408.61 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 00:40:59,712 epoch 9 - iter 430/432 - loss 0.06892771 - time (sec): 151.31 - samples/sec: 407.87 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:41:00,072 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:41:00,072 EPOCH 9 done: loss 0.0689 - lr: 0.000003
+ 2023-10-19 00:41:13,916 DEV : loss 0.3687077760696411 - f1-score (micro avg) 0.839
+ 2023-10-19 00:41:13,941 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:41:27,952 epoch 10 - iter 43/432 - loss 0.06914216 - time (sec): 14.01 - samples/sec: 423.21 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:41:43,413 epoch 10 - iter 86/432 - loss 0.06348909 - time (sec): 29.47 - samples/sec: 402.17 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:41:58,231 epoch 10 - iter 129/432 - loss 0.06567792 - time (sec): 44.29 - samples/sec: 408.14 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 00:42:13,509 epoch 10 - iter 172/432 - loss 0.06502152 - time (sec): 59.57 - samples/sec: 413.44 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 00:42:28,837 epoch 10 - iter 215/432 - loss 0.06585712 - time (sec): 74.89 - samples/sec: 417.08 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 00:42:43,697 epoch 10 - iter 258/432 - loss 0.06365877 - time (sec): 89.75 - samples/sec: 417.12 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 00:42:58,257 epoch 10 - iter 301/432 - loss 0.06317799 - time (sec): 104.31 - samples/sec: 418.47 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 00:43:13,120 epoch 10 - iter 344/432 - loss 0.06253740 - time (sec): 119.18 - samples/sec: 418.00 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 00:43:26,660 epoch 10 - iter 387/432 - loss 0.06022767 - time (sec): 132.72 - samples/sec: 419.38 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 00:43:42,384 epoch 10 - iter 430/432 - loss 0.05945663 - time (sec): 148.44 - samples/sec: 415.36 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 00:43:43,064 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:43:43,064 EPOCH 10 done: loss 0.0593 - lr: 0.000000
+ 2023-10-19 00:43:56,330 DEV : loss 0.37211665511131287 - f1-score (micro avg) 0.837
+ 2023-10-19 00:43:56,780 ----------------------------------------------------------------------------------------------------
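The `DEV :` lines above are what decide which checkpoint gets saved as best-model.pt, since the final evaluation metric is the micro-averaged F1 score. The following sketch extracts those lines with a regular expression and picks the best epoch. It assumes exactly one `DEV :` line per epoch, in order; `dev_scores` and the pattern are written for this illustration and are not part of Flair.

```python
import re

# Matches lines such as:
# "... DEV : loss 0.5642062425613403 - f1-score (micro avg) 0.6351"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)


def dev_scores(log_text):
    """Return [(epoch, dev_loss, dev_f1), ...], assuming one DEV line per epoch."""
    return [
        (epoch, float(m["loss"]), float(m["f1"]))
        for epoch, m in enumerate(DEV_LINE.finditer(log_text), start=1)
    ]


# DEV lines copied from the log above.
sample = """\
2023-10-19 00:19:23,881 DEV : loss 0.5642062425613403 - f1-score (micro avg) 0.6351
2023-10-19 00:22:05,834 DEV : loss 0.346346378326416 - f1-score (micro avg) 0.7698
2023-10-19 00:24:49,187 DEV : loss 0.31504037976264954 - f1-score (micro avg) 0.7963
2023-10-19 00:27:32,562 DEV : loss 0.3091279864311218 - f1-score (micro avg) 0.8213
2023-10-19 00:30:16,827 DEV : loss 0.3160895109176636 - f1-score (micro avg) 0.8377
2023-10-19 00:33:00,960 DEV : loss 0.33405086398124695 - f1-score (micro avg) 0.8325
2023-10-19 00:35:45,283 DEV : loss 0.351141095161438 - f1-score (micro avg) 0.831
2023-10-19 00:38:27,128 DEV : loss 0.35482051968574524 - f1-score (micro avg) 0.8445
2023-10-19 00:41:13,916 DEV : loss 0.3687077760696411 - f1-score (micro avg) 0.839
2023-10-19 00:43:56,330 DEV : loss 0.37211665511131287 - f1-score (micro avg) 0.837
"""

scores = dev_scores(sample)
best_epoch, _, best_f1 = max(scores, key=lambda s: s[2])
lowest_loss_epoch = min(scores, key=lambda s: s[1])[0]
print(f"best dev F1 {best_f1} at epoch {best_epoch}")
print(f"lowest dev loss at epoch {lowest_loss_epoch}")
```

In this run the dev loss is lowest at epoch 4 while the dev F1 peaks at epoch 8, which is the last epoch followed by a "saving best model" line.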
+ 2023-10-19 00:43:56,781 Loading model from best epoch ...
+ 2023-10-19 00:43:58,908 SequenceTagger predicts: Dictionary with 81 tags: O, S-location-route, B-location-route, E-location-route, I-location-route, S-location-stop, B-location-stop, E-location-stop, I-location-stop, S-trigger, B-trigger, E-trigger, I-trigger, S-organization-company, B-organization-company, E-organization-company, I-organization-company, S-location-city, B-location-city, E-location-city, I-location-city, S-location, B-location, E-location, I-location, S-event-cause, B-event-cause, E-event-cause, I-event-cause, S-location-street, B-location-street, E-location-street, I-location-street, S-time, B-time, E-time, I-time, S-date, B-date, E-date, I-date, S-number, B-number, E-number, I-number, S-duration, B-duration, E-duration, I-duration, S-organization
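The tag dictionary above follows a BIOES layout: one `O` tag plus `S-`, `B-`, `E-` and `I-` variants for each entity type, so 81 tags correspond to (81 - 1) / 4 = 20 entity types. A minimal sketch of how such a tag set is built is shown below; `bioes_tags` is a hypothetical helper for illustration, and only type names visible in the log line are used.

```python
def bioes_tags(entity_types):
    """Build a BIOES tag list: "O" plus S-/B-/E-/I- variants per entity type."""
    tags = ["O"]
    for entity_type in entity_types:
        # Same prefix order as in the logged dictionary: S, B, E, I.
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags


# Three of the entity types listed in the log line above.
tags = bioes_tags(["location-route", "location-stop", "trigger"])
print(len(tags), tags[:5])

# Number of entity types implied by an 81-tag dictionary.
implied_types = (81 - 1) // 4
print(implied_types)
```

The tag count also matches the `out_features=81` of the final linear layer in the model summary.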
+ 2023-10-19 00:44:16,847
+ Results:
+ - F-score (micro) 0.7524
+ - F-score (macro) 0.553
+ - Accuracy 0.6489
+
+ By class:
+                       precision    recall  f1-score   support
+
+              trigger    0.6983    0.6086    0.6504       833
+        location-stop    0.8598    0.7856    0.8210       765
+             location    0.8076    0.8331    0.8201       665
+        location-city    0.7705    0.8958    0.8284       566
+                 date    0.8786    0.8452    0.8616       394
+      location-street    0.9137    0.8782    0.8956       386
+                 time    0.7917    0.8906    0.8382       256
+       location-route    0.8138    0.7077    0.7571       284
+ organization-company    0.7578    0.6706    0.7116       252
+               number    0.6378    0.8389    0.7246       149
+             distance    1.0000    1.0000    1.0000       167
+             duration    0.3091    0.3129    0.3110       163
+          event-cause    0.0000    0.0000    0.0000         0
+        disaster-type    0.8333    0.1449    0.2469        69
+         organization    0.3750    0.5357    0.4412        28
+               person    0.4500    0.9000    0.6000        10
+                  set    0.0000    0.0000    0.0000         0
+         org-position    0.0000    0.0000    0.0000         1
+                money    0.0000    0.0000    0.0000         0
+
+            micro avg    0.7403    0.7650    0.7524      4988
+            macro avg    0.5735    0.5709    0.5530      4988
+         weighted avg    0.7861    0.7650    0.7697      4988
+
+ 2023-10-19 00:44:16,847 ----------------------------------------------------------------------------------------------------
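The three summary rows of the per-class table can be re-derived from the rows above them, which also explains the gap between the micro and macro F-scores. The sketch below recomputes them from the logged values. The data is copied from the table, and the averaging rules are the usual definitions: micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean over classes, weighted F1 as the support-weighted mean.

```python
# (f1-score, support) per class, copied from the "By class" table.
per_class = {
    "trigger": (0.6504, 833),
    "location-stop": (0.8210, 765),
    "location": (0.8201, 665),
    "location-city": (0.8284, 566),
    "date": (0.8616, 394),
    "location-street": (0.8956, 386),
    "time": (0.8382, 256),
    "location-route": (0.7571, 284),
    "organization-company": (0.7116, 252),
    "number": (0.7246, 149),
    "distance": (1.0000, 167),
    "duration": (0.3110, 163),
    "event-cause": (0.0000, 0),
    "disaster-type": (0.2469, 69),
    "organization": (0.4412, 28),
    "person": (0.6000, 10),
    "set": (0.0000, 0),
    "org-position": (0.0000, 1),
    "money": (0.0000, 0),
}

total_support = sum(n for _, n in per_class.values())

# Macro average: every class counts equally, including zero-support ones.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support

# Micro average: harmonic mean of the logged micro precision and recall.
micro_p, micro_r = 0.7403, 0.7650
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support {total_support}")
print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

Because the macro average gives equal weight to all 19 rows, the classes with an F1 of 0.0000 (event-cause, set, org-position, money) and the low-scoring rare classes pull it well below the micro average.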