stefan-it committed on
Commit
2122bc8
1 Parent(s): 99f55bf

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +265 -0
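The commit message records that this file was pushed with the huggingface_hub client. For reference, a minimal sketch of such an upload is shown below; the repo_id is a placeholder and is not taken from this commit.

```python
# Minimal sketch of uploading a training log with huggingface_hub.
# The repo_id below is a placeholder (assumption), not read from this commit page.
from huggingface_hub import HfApi

api = HfApi()  # uses the token stored by `huggingface-cli login`
api.upload_file(
    path_or_fileobj="./training.log",     # local file to push
    path_in_repo="training.log",          # destination path inside the repo
    repo_id="stefan-it/your-model-repo",  # placeholder repo id
    repo_type="model",
    commit_message="Upload ./training.log with huggingface_hub",
)
```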
training.log ADDED
@@ -0,0 +1,265 @@
+ 2024-03-26 11:16:46,211 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(30001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=17, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Train: 758 sentences
+ 2024-03-26 11:16:46,212 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Training Params:
+ 2024-03-26 11:16:46,212 - learning_rate: "5e-05"
+ 2024-03-26 11:16:46,212 - mini_batch_size: "16"
+ 2024-03-26 11:16:46,212 - max_epochs: "10"
+ 2024-03-26 11:16:46,212 - shuffle: "True"
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Plugins:
+ 2024-03-26 11:16:46,212 - TensorboardLogger
+ 2024-03-26 11:16:46,212 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 11:16:46,212 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Computation:
+ 2024-03-26 11:16:46,212 - compute on device: cuda:0
+ 2024-03-26 11:16:46,212 - embedding storage: none
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Model training base path: "flair-co-funer-german_bert_base-bs16-e10-lr5e-05-2"
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:16:46,212 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2024-03-26 11:16:47,993 epoch 1 - iter 4/48 - loss 3.15943466 - time (sec): 1.78 - samples/sec: 1696.29 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:16:50,199 epoch 1 - iter 8/48 - loss 3.05235430 - time (sec): 3.99 - samples/sec: 1557.38 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 11:16:52,117 epoch 1 - iter 12/48 - loss 2.94913909 - time (sec): 5.90 - samples/sec: 1509.63 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:16:54,122 epoch 1 - iter 16/48 - loss 2.77403858 - time (sec): 7.91 - samples/sec: 1534.06 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:16:56,377 epoch 1 - iter 20/48 - loss 2.62217028 - time (sec): 10.16 - samples/sec: 1503.44 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 11:16:59,474 epoch 1 - iter 24/48 - loss 2.48065772 - time (sec): 13.26 - samples/sec: 1370.87 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 11:17:01,976 epoch 1 - iter 28/48 - loss 2.34149056 - time (sec): 15.76 - samples/sec: 1352.03 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:17:02,812 epoch 1 - iter 32/48 - loss 2.25298393 - time (sec): 16.60 - samples/sec: 1406.67 - lr: 0.000032 - momentum: 0.000000
+ 2024-03-26 11:17:04,135 epoch 1 - iter 36/48 - loss 2.14698057 - time (sec): 17.92 - samples/sec: 1459.09 - lr: 0.000036 - momentum: 0.000000
+ 2024-03-26 11:17:06,067 epoch 1 - iter 40/48 - loss 2.04219894 - time (sec): 19.85 - samples/sec: 1465.20 - lr: 0.000041 - momentum: 0.000000
+ 2024-03-26 11:17:08,030 epoch 1 - iter 44/48 - loss 1.93332136 - time (sec): 21.82 - samples/sec: 1464.45 - lr: 0.000045 - momentum: 0.000000
+ 2024-03-26 11:17:09,428 epoch 1 - iter 48/48 - loss 1.84795953 - time (sec): 23.22 - samples/sec: 1484.86 - lr: 0.000049 - momentum: 0.000000
+ 2024-03-26 11:17:09,428 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:17:09,428 EPOCH 1 done: loss 1.8480 - lr: 0.000049
+ 2024-03-26 11:17:10,350 DEV : loss 0.5429823398590088 - f1-score (micro avg) 0.633
+ 2024-03-26 11:17:10,351 saving best model
+ 2024-03-26 11:17:10,630 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:17:11,954 epoch 2 - iter 4/48 - loss 0.75282481 - time (sec): 1.32 - samples/sec: 2192.70 - lr: 0.000050 - momentum: 0.000000
+ 2024-03-26 11:17:13,819 epoch 2 - iter 8/48 - loss 0.62087530 - time (sec): 3.19 - samples/sec: 1912.95 - lr: 0.000049 - momentum: 0.000000
+ 2024-03-26 11:17:17,322 epoch 2 - iter 12/48 - loss 0.52501189 - time (sec): 6.69 - samples/sec: 1520.86 - lr: 0.000049 - momentum: 0.000000
+ 2024-03-26 11:17:19,881 epoch 2 - iter 16/48 - loss 0.49391674 - time (sec): 9.25 - samples/sec: 1439.80 - lr: 0.000048 - momentum: 0.000000
+ 2024-03-26 11:17:22,688 epoch 2 - iter 20/48 - loss 0.46323174 - time (sec): 12.06 - samples/sec: 1377.78 - lr: 0.000048 - momentum: 0.000000
+ 2024-03-26 11:17:24,686 epoch 2 - iter 24/48 - loss 0.44126141 - time (sec): 14.06 - samples/sec: 1371.64 - lr: 0.000047 - momentum: 0.000000
+ 2024-03-26 11:17:26,494 epoch 2 - iter 28/48 - loss 0.44609745 - time (sec): 15.86 - samples/sec: 1382.57 - lr: 0.000047 - momentum: 0.000000
+ 2024-03-26 11:17:28,311 epoch 2 - iter 32/48 - loss 0.44004278 - time (sec): 17.68 - samples/sec: 1391.63 - lr: 0.000046 - momentum: 0.000000
+ 2024-03-26 11:17:30,239 epoch 2 - iter 36/48 - loss 0.42823178 - time (sec): 19.61 - samples/sec: 1399.00 - lr: 0.000046 - momentum: 0.000000
+ 2024-03-26 11:17:31,269 epoch 2 - iter 40/48 - loss 0.42345263 - time (sec): 20.64 - samples/sec: 1446.43 - lr: 0.000046 - momentum: 0.000000
+ 2024-03-26 11:17:32,746 epoch 2 - iter 44/48 - loss 0.42198128 - time (sec): 22.11 - samples/sec: 1465.79 - lr: 0.000045 - momentum: 0.000000
+ 2024-03-26 11:17:34,323 epoch 2 - iter 48/48 - loss 0.40881526 - time (sec): 23.69 - samples/sec: 1454.96 - lr: 0.000045 - momentum: 0.000000
+ 2024-03-26 11:17:34,324 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:17:34,324 EPOCH 2 done: loss 0.4088 - lr: 0.000045
+ 2024-03-26 11:17:35,254 DEV : loss 0.28016945719718933 - f1-score (micro avg) 0.8134
+ 2024-03-26 11:17:35,255 saving best model
+ 2024-03-26 11:17:35,702 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:17:38,267 epoch 3 - iter 4/48 - loss 0.23345348 - time (sec): 2.56 - samples/sec: 1173.72 - lr: 0.000044 - momentum: 0.000000
+ 2024-03-26 11:17:40,465 epoch 3 - iter 8/48 - loss 0.23145403 - time (sec): 4.76 - samples/sec: 1333.40 - lr: 0.000044 - momentum: 0.000000
+ 2024-03-26 11:17:42,063 epoch 3 - iter 12/48 - loss 0.23780941 - time (sec): 6.36 - samples/sec: 1394.93 - lr: 0.000043 - momentum: 0.000000
+ 2024-03-26 11:17:43,845 epoch 3 - iter 16/48 - loss 0.22351506 - time (sec): 8.14 - samples/sec: 1395.99 - lr: 0.000043 - momentum: 0.000000
+ 2024-03-26 11:17:45,035 epoch 3 - iter 20/48 - loss 0.22899955 - time (sec): 9.33 - samples/sec: 1466.21 - lr: 0.000042 - momentum: 0.000000
+ 2024-03-26 11:17:46,912 epoch 3 - iter 24/48 - loss 0.23890119 - time (sec): 11.21 - samples/sec: 1467.99 - lr: 0.000042 - momentum: 0.000000
+ 2024-03-26 11:17:49,413 epoch 3 - iter 28/48 - loss 0.23771145 - time (sec): 13.71 - samples/sec: 1411.16 - lr: 0.000041 - momentum: 0.000000
+ 2024-03-26 11:17:51,287 epoch 3 - iter 32/48 - loss 0.23968050 - time (sec): 15.58 - samples/sec: 1420.58 - lr: 0.000041 - momentum: 0.000000
+ 2024-03-26 11:17:52,761 epoch 3 - iter 36/48 - loss 0.23251473 - time (sec): 17.06 - samples/sec: 1452.67 - lr: 0.000040 - momentum: 0.000000
+ 2024-03-26 11:17:55,069 epoch 3 - iter 40/48 - loss 0.22549522 - time (sec): 19.37 - samples/sec: 1425.67 - lr: 0.000040 - momentum: 0.000000
+ 2024-03-26 11:17:58,436 epoch 3 - iter 44/48 - loss 0.20910800 - time (sec): 22.73 - samples/sec: 1417.46 - lr: 0.000040 - momentum: 0.000000
+ 2024-03-26 11:17:59,781 epoch 3 - iter 48/48 - loss 0.20549323 - time (sec): 24.08 - samples/sec: 1431.65 - lr: 0.000039 - momentum: 0.000000
+ 2024-03-26 11:17:59,782 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:17:59,782 EPOCH 3 done: loss 0.2055 - lr: 0.000039
+ 2024-03-26 11:18:00,714 DEV : loss 0.2106706202030182 - f1-score (micro avg) 0.8663
+ 2024-03-26 11:18:00,715 saving best model
+ 2024-03-26 11:18:01,155 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:18:02,763 epoch 4 - iter 4/48 - loss 0.21264590 - time (sec): 1.61 - samples/sec: 1588.44 - lr: 0.000039 - momentum: 0.000000
+ 2024-03-26 11:18:05,112 epoch 4 - iter 8/48 - loss 0.16186058 - time (sec): 3.95 - samples/sec: 1515.65 - lr: 0.000038 - momentum: 0.000000
+ 2024-03-26 11:18:06,364 epoch 4 - iter 12/48 - loss 0.15033662 - time (sec): 5.21 - samples/sec: 1605.29 - lr: 0.000038 - momentum: 0.000000
+ 2024-03-26 11:18:08,647 epoch 4 - iter 16/48 - loss 0.14703161 - time (sec): 7.49 - samples/sec: 1505.41 - lr: 0.000037 - momentum: 0.000000
+ 2024-03-26 11:18:11,277 epoch 4 - iter 20/48 - loss 0.13607905 - time (sec): 10.12 - samples/sec: 1381.72 - lr: 0.000037 - momentum: 0.000000
+ 2024-03-26 11:18:13,388 epoch 4 - iter 24/48 - loss 0.14432807 - time (sec): 12.23 - samples/sec: 1376.38 - lr: 0.000036 - momentum: 0.000000
+ 2024-03-26 11:18:15,527 epoch 4 - iter 28/48 - loss 0.14018550 - time (sec): 14.37 - samples/sec: 1384.49 - lr: 0.000036 - momentum: 0.000000
+ 2024-03-26 11:18:18,192 epoch 4 - iter 32/48 - loss 0.13898599 - time (sec): 17.03 - samples/sec: 1353.72 - lr: 0.000035 - momentum: 0.000000
+ 2024-03-26 11:18:21,031 epoch 4 - iter 36/48 - loss 0.13184037 - time (sec): 19.87 - samples/sec: 1346.00 - lr: 0.000035 - momentum: 0.000000
+ 2024-03-26 11:18:22,799 epoch 4 - iter 40/48 - loss 0.12873724 - time (sec): 21.64 - samples/sec: 1344.37 - lr: 0.000034 - momentum: 0.000000
+ 2024-03-26 11:18:24,883 epoch 4 - iter 44/48 - loss 0.12670940 - time (sec): 23.73 - samples/sec: 1345.51 - lr: 0.000034 - momentum: 0.000000
+ 2024-03-26 11:18:26,610 epoch 4 - iter 48/48 - loss 0.12560460 - time (sec): 25.45 - samples/sec: 1354.39 - lr: 0.000034 - momentum: 0.000000
+ 2024-03-26 11:18:26,610 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:18:26,610 EPOCH 4 done: loss 0.1256 - lr: 0.000034
+ 2024-03-26 11:18:27,551 DEV : loss 0.1935855746269226 - f1-score (micro avg) 0.8846
+ 2024-03-26 11:18:27,552 saving best model
+ 2024-03-26 11:18:27,984 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:18:28,825 epoch 5 - iter 4/48 - loss 0.05091057 - time (sec): 0.84 - samples/sec: 2181.44 - lr: 0.000033 - momentum: 0.000000
+ 2024-03-26 11:18:30,239 epoch 5 - iter 8/48 - loss 0.07272383 - time (sec): 2.25 - samples/sec: 1973.48 - lr: 0.000033 - momentum: 0.000000
+ 2024-03-26 11:18:33,183 epoch 5 - iter 12/48 - loss 0.07617273 - time (sec): 5.20 - samples/sec: 1535.24 - lr: 0.000032 - momentum: 0.000000
+ 2024-03-26 11:18:36,272 epoch 5 - iter 16/48 - loss 0.07780937 - time (sec): 8.29 - samples/sec: 1361.74 - lr: 0.000032 - momentum: 0.000000
+ 2024-03-26 11:18:37,687 epoch 5 - iter 20/48 - loss 0.08570020 - time (sec): 9.70 - samples/sec: 1414.94 - lr: 0.000031 - momentum: 0.000000
+ 2024-03-26 11:18:40,231 epoch 5 - iter 24/48 - loss 0.08408737 - time (sec): 12.25 - samples/sec: 1368.21 - lr: 0.000031 - momentum: 0.000000
+ 2024-03-26 11:18:42,356 epoch 5 - iter 28/48 - loss 0.08255117 - time (sec): 14.37 - samples/sec: 1359.59 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 11:18:44,712 epoch 5 - iter 32/48 - loss 0.08552872 - time (sec): 16.73 - samples/sec: 1384.72 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 11:18:46,223 epoch 5 - iter 36/48 - loss 0.08821291 - time (sec): 18.24 - samples/sec: 1408.45 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 11:18:48,787 epoch 5 - iter 40/48 - loss 0.08411875 - time (sec): 20.80 - samples/sec: 1365.75 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 11:18:50,922 epoch 5 - iter 44/48 - loss 0.08515196 - time (sec): 22.94 - samples/sec: 1379.22 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 11:18:52,886 epoch 5 - iter 48/48 - loss 0.08656351 - time (sec): 24.90 - samples/sec: 1384.33 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:18:52,887 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:18:52,887 EPOCH 5 done: loss 0.0866 - lr: 0.000028
+ 2024-03-26 11:18:53,817 DEV : loss 0.19532091915607452 - f1-score (micro avg) 0.9016
+ 2024-03-26 11:18:53,819 saving best model
+ 2024-03-26 11:18:54,268 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:18:55,970 epoch 6 - iter 4/48 - loss 0.08548813 - time (sec): 1.70 - samples/sec: 1464.57 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:18:58,418 epoch 6 - iter 8/48 - loss 0.07937192 - time (sec): 4.15 - samples/sec: 1542.85 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 11:19:00,401 epoch 6 - iter 12/48 - loss 0.07110090 - time (sec): 6.13 - samples/sec: 1477.40 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 11:19:02,537 epoch 6 - iter 16/48 - loss 0.06863509 - time (sec): 8.27 - samples/sec: 1466.76 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 11:19:05,304 epoch 6 - iter 20/48 - loss 0.06768250 - time (sec): 11.03 - samples/sec: 1447.96 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 11:19:06,872 epoch 6 - iter 24/48 - loss 0.07508927 - time (sec): 12.60 - samples/sec: 1468.79 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:19:08,298 epoch 6 - iter 28/48 - loss 0.07565420 - time (sec): 14.03 - samples/sec: 1473.36 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:19:09,509 epoch 6 - iter 32/48 - loss 0.07301484 - time (sec): 15.24 - samples/sec: 1492.92 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 11:19:11,034 epoch 6 - iter 36/48 - loss 0.06942201 - time (sec): 16.76 - samples/sec: 1523.08 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 11:19:13,012 epoch 6 - iter 40/48 - loss 0.07133967 - time (sec): 18.74 - samples/sec: 1512.02 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 11:19:15,300 epoch 6 - iter 44/48 - loss 0.06908886 - time (sec): 21.03 - samples/sec: 1528.97 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 11:19:17,041 epoch 6 - iter 48/48 - loss 0.06967724 - time (sec): 22.77 - samples/sec: 1513.89 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 11:19:17,041 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:19:17,041 EPOCH 6 done: loss 0.0697 - lr: 0.000023
+ 2024-03-26 11:19:17,975 DEV : loss 0.1747245490550995 - f1-score (micro avg) 0.916
+ 2024-03-26 11:19:17,976 saving best model
+ 2024-03-26 11:19:18,419 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:19:20,065 epoch 7 - iter 4/48 - loss 0.04385190 - time (sec): 1.64 - samples/sec: 1481.61 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:19:21,725 epoch 7 - iter 8/48 - loss 0.05737850 - time (sec): 3.30 - samples/sec: 1499.05 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:19:23,900 epoch 7 - iter 12/48 - loss 0.05225209 - time (sec): 5.48 - samples/sec: 1436.43 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 11:19:25,967 epoch 7 - iter 16/48 - loss 0.04753228 - time (sec): 7.55 - samples/sec: 1476.37 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 11:19:26,626 epoch 7 - iter 20/48 - loss 0.04517211 - time (sec): 8.21 - samples/sec: 1579.39 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 11:19:28,231 epoch 7 - iter 24/48 - loss 0.04582161 - time (sec): 9.81 - samples/sec: 1561.85 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 11:19:31,160 epoch 7 - iter 28/48 - loss 0.04440788 - time (sec): 12.74 - samples/sec: 1461.94 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 11:19:33,988 epoch 7 - iter 32/48 - loss 0.04388928 - time (sec): 15.57 - samples/sec: 1391.71 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 11:19:36,840 epoch 7 - iter 36/48 - loss 0.04910196 - time (sec): 18.42 - samples/sec: 1399.69 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 11:19:38,832 epoch 7 - iter 40/48 - loss 0.05279163 - time (sec): 20.41 - samples/sec: 1408.42 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 11:19:41,428 epoch 7 - iter 44/48 - loss 0.05326335 - time (sec): 23.01 - samples/sec: 1384.52 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 11:19:43,269 epoch 7 - iter 48/48 - loss 0.05213195 - time (sec): 24.85 - samples/sec: 1387.33 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 11:19:43,269 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:19:43,269 EPOCH 7 done: loss 0.0521 - lr: 0.000017
+ 2024-03-26 11:19:44,202 DEV : loss 0.1832708865404129 - f1-score (micro avg) 0.911
+ 2024-03-26 11:19:44,203 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:19:46,904 epoch 8 - iter 4/48 - loss 0.04675538 - time (sec): 2.70 - samples/sec: 1223.02 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 11:19:49,044 epoch 8 - iter 8/48 - loss 0.03493802 - time (sec): 4.84 - samples/sec: 1212.36 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:19:52,242 epoch 8 - iter 12/48 - loss 0.03381936 - time (sec): 8.04 - samples/sec: 1205.64 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:19:54,226 epoch 8 - iter 16/48 - loss 0.04243199 - time (sec): 10.02 - samples/sec: 1231.53 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 11:19:55,720 epoch 8 - iter 20/48 - loss 0.03854979 - time (sec): 11.52 - samples/sec: 1275.09 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 11:19:58,301 epoch 8 - iter 24/48 - loss 0.03855591 - time (sec): 14.10 - samples/sec: 1265.80 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 11:20:00,098 epoch 8 - iter 28/48 - loss 0.04124147 - time (sec): 15.89 - samples/sec: 1300.94 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 11:20:01,734 epoch 8 - iter 32/48 - loss 0.03962775 - time (sec): 17.53 - samples/sec: 1327.09 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 11:20:03,026 epoch 8 - iter 36/48 - loss 0.03946872 - time (sec): 18.82 - samples/sec: 1359.20 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 11:20:05,440 epoch 8 - iter 40/48 - loss 0.03972045 - time (sec): 21.24 - samples/sec: 1365.27 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 11:20:08,341 epoch 8 - iter 44/48 - loss 0.03737704 - time (sec): 24.14 - samples/sec: 1334.75 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 11:20:10,427 epoch 8 - iter 48/48 - loss 0.03797592 - time (sec): 26.22 - samples/sec: 1314.57 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:20:10,427 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:20:10,427 EPOCH 8 done: loss 0.0380 - lr: 0.000011
+ 2024-03-26 11:20:11,369 DEV : loss 0.1888364851474762 - f1-score (micro avg) 0.9299
+ 2024-03-26 11:20:11,370 saving best model
+ 2024-03-26 11:20:11,812 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:20:13,741 epoch 9 - iter 4/48 - loss 0.03350915 - time (sec): 1.93 - samples/sec: 1476.17 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:20:16,180 epoch 9 - iter 8/48 - loss 0.02728296 - time (sec): 4.37 - samples/sec: 1404.75 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:20:18,591 epoch 9 - iter 12/48 - loss 0.03564997 - time (sec): 6.78 - samples/sec: 1362.34 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 11:20:20,699 epoch 9 - iter 16/48 - loss 0.03661492 - time (sec): 8.88 - samples/sec: 1361.41 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 11:20:22,204 epoch 9 - iter 20/48 - loss 0.03193217 - time (sec): 10.39 - samples/sec: 1418.57 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 11:20:23,438 epoch 9 - iter 24/48 - loss 0.02928490 - time (sec): 11.62 - samples/sec: 1465.75 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 11:20:25,136 epoch 9 - iter 28/48 - loss 0.02830002 - time (sec): 13.32 - samples/sec: 1484.58 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 11:20:27,488 epoch 9 - iter 32/48 - loss 0.03365236 - time (sec): 15.67 - samples/sec: 1467.13 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 11:20:30,228 epoch 9 - iter 36/48 - loss 0.03477106 - time (sec): 18.41 - samples/sec: 1418.66 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 11:20:33,201 epoch 9 - iter 40/48 - loss 0.03467439 - time (sec): 21.39 - samples/sec: 1378.21 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 11:20:35,067 epoch 9 - iter 44/48 - loss 0.03450948 - time (sec): 23.25 - samples/sec: 1392.58 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 11:20:36,124 epoch 9 - iter 48/48 - loss 0.03400891 - time (sec): 24.31 - samples/sec: 1418.07 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 11:20:36,124 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:20:36,124 EPOCH 9 done: loss 0.0340 - lr: 0.000006
+ 2024-03-26 11:20:37,072 DEV : loss 0.1788191944360733 - f1-score (micro avg) 0.9328
+ 2024-03-26 11:20:37,073 saving best model
+ 2024-03-26 11:20:37,537 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:20:39,897 epoch 10 - iter 4/48 - loss 0.01213481 - time (sec): 2.36 - samples/sec: 1400.54 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 11:20:42,071 epoch 10 - iter 8/48 - loss 0.01599539 - time (sec): 4.53 - samples/sec: 1363.25 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 11:20:43,992 epoch 10 - iter 12/48 - loss 0.01796113 - time (sec): 6.45 - samples/sec: 1367.47 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 11:20:45,227 epoch 10 - iter 16/48 - loss 0.01897141 - time (sec): 7.69 - samples/sec: 1433.43 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 11:20:47,232 epoch 10 - iter 20/48 - loss 0.02538412 - time (sec): 9.69 - samples/sec: 1414.27 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 11:20:49,589 epoch 10 - iter 24/48 - loss 0.03308522 - time (sec): 12.05 - samples/sec: 1378.16 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:20:50,495 epoch 10 - iter 28/48 - loss 0.03246447 - time (sec): 12.96 - samples/sec: 1450.20 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:20:51,826 epoch 10 - iter 32/48 - loss 0.03099225 - time (sec): 14.29 - samples/sec: 1486.63 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 11:20:54,658 epoch 10 - iter 36/48 - loss 0.02893431 - time (sec): 17.12 - samples/sec: 1442.43 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 11:20:57,051 epoch 10 - iter 40/48 - loss 0.02897645 - time (sec): 19.51 - samples/sec: 1473.63 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 11:20:59,691 epoch 10 - iter 44/48 - loss 0.02902513 - time (sec): 22.15 - samples/sec: 1447.93 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 11:21:01,695 epoch 10 - iter 48/48 - loss 0.02812175 - time (sec): 24.16 - samples/sec: 1427.09 - lr: 0.000000 - momentum: 0.000000
+ 2024-03-26 11:21:01,695 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:21:01,695 EPOCH 10 done: loss 0.0281 - lr: 0.000000
+ 2024-03-26 11:21:02,635 DEV : loss 0.1868205964565277 - f1-score (micro avg) 0.9321
+ 2024-03-26 11:21:02,915 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:21:02,915 Loading model from best epoch ...
+ 2024-03-26 11:21:03,816 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 11:21:04,576
+ Results:
+ - F-score (micro) 0.9095
+ - F-score (macro) 0.6907
+ - Accuracy 0.8364
+
+ By class:
+ precision recall f1-score support
+
+ Unternehmen 0.9109 0.8835 0.8969 266
+ Auslagerung 0.8760 0.9076 0.8915 249
+ Ort 0.9635 0.9851 0.9742 134
+ Software 0.0000 0.0000 0.0000 0
+
+ micro avg 0.9053 0.9137 0.9095 649
+ macro avg 0.6876 0.6940 0.6907 649
+ weighted avg 0.9083 0.9137 0.9108 649
+
+ 2024-03-26 11:21:04,576 ----------------------------------------------------------------------------------------------------
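The log above documents a standard Flair fine-tuning run: a SequenceTagger with no RNN and no CRF, a single Linear(768 → 17) head over transformer word embeddings, LockedDropout(0.5), learning rate 5e-05, mini-batch size 16, 10 epochs, and a linear schedule with 10% warmup. A minimal sketch of a script consistent with this setup follows; the base checkpoint name and the corpus loading are assumptions, since the log does not record them.

```python
# Hedged sketch of a Flair fine-tuning script consistent with the log above.
# Assumptions: the German BERT checkpoint and the local data path are placeholders;
# only the hyperparameters and the tagger layout are taken from the log.
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus in two-column CoNLL format (token, NER tag); the path is a placeholder.
corpus = ColumnCorpus("data/co-funer", {0: "text", 1: "ner"})
label_dict = corpus.make_label_dictionary(label_type="ner")

# 768-dim transformer embeddings, fine-tuned end to end (checkpoint is a placeholder).
embeddings = TransformerWordEmbeddings(model="bert-base-german-cased", fine_tune=True)

# Roughly matches the printed architecture: LockedDropout(0.5) + Linear(768 -> 17),
# CrossEntropyLoss, no RNN and no CRF.
tagger = SequenceTagger(
    hidden_size=256,               # ignored when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
)

# fine_tune() defaults to AdamW with a linear schedule and 10% warmup, matching the
# LinearScheduler(warmup_fraction=0.1) plugin in the log; the TensorboardLogger
# plugin shown in the log is omitted here for brevity.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "flair-co-funer-german_bert_base-bs16-e10-lr5e-05-2",  # base path from the log
    learning_rate=5e-5,
    mini_batch_size=16,
    max_epochs=10,
)
```

After training, the saved best-model.pt can be reloaded for inference in the usual way:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "flair-co-funer-german_bert_base-bs16-e10-lr5e-05-2/best-model.pt"
)
sentence = Sentence("Die Beispiel GmbH lagert ihre IT nach Berlin aus.")  # example input
tagger.predict(sentence)
print(sentence.get_spans("ner"))
```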