temporary0-0name committed on
Commit 5eaa0c7
Parent: 3e38f2d

End of training

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [temporary0-0name/run_opt](https://huggingface.co/temporary0-0name/run_opt) on the wikitext dataset.
  It achieves the following results on the evaluation set:
- - Loss: 6.4118
+ - Loss: 0.0137

  ## Model description

@@ -45,13 +45,30 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 100
- - num_epochs: 1
+ - num_epochs: 10

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 7.6447 | 0.55 | 100 | 6.4118 |
+ | 7.6838 | 0.55 | 100 | 6.4397 |
+ | 4.9321 | 1.1 | 200 | 2.1585 |
+ | 1.0026 | 1.65 | 300 | 0.3947 |
+ | 0.2259 | 2.2 | 400 | 0.1365 |
+ | 0.0801 | 2.75 | 500 | 0.0636 |
+ | 0.0369 | 3.29 | 600 | 0.0379 |
+ | 0.0212 | 3.84 | 700 | 0.0271 |
+ | 0.0135 | 4.39 | 800 | 0.0216 |
+ | 0.0103 | 4.94 | 900 | 0.0183 |
+ | 0.0077 | 5.49 | 1000 | 0.0163 |
+ | 0.0068 | 6.04 | 1100 | 0.0153 |
+ | 0.0057 | 6.59 | 1200 | 0.0147 |
+ | 0.0053 | 7.14 | 1300 | 0.0142 |
+ | 0.0048 | 7.69 | 1400 | 0.0140 |
+ | 0.0046 | 8.24 | 1500 | 0.0138 |
+ | 0.0045 | 8.79 | 1600 | 0.0137 |
+ | 0.0044 | 9.33 | 1700 | 0.0137 |
+ | 0.0044 | 9.88 | 1800 | 0.0137 |


  ### Framework versions
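
The evaluation loss changed by this commit can be read as a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats (the usual convention for language-model training, but not stated in this diff); the helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Evaluation losses reported in the README before and after this commit.
loss_before = 6.4118
loss_after = 0.0137

ppl_before = perplexity(loss_before)
ppl_after = perplexity(loss_after)

print(f"before: loss={loss_before} perplexity={ppl_before:.1f}")
print(f"after:  loss={loss_after} perplexity={ppl_after:.4f}")
```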
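
The `cosine` scheduler with 100 warmup steps listed in the hyperparameters can be sketched as a multiplier on the peak learning rate: linear warmup from 0 to 1, then a half-cosine decay to 0. This is a minimal re-implementation of that shape, not the code that produced this run. The function name `cosine_with_warmup` is hypothetical, and the figure of 182 steps per epoch is an assumption estimated from the table (step 100 falls at about epoch 0.55).

```python
import math


def cosine_with_warmup(step: int, warmup_steps: int, total_steps: int) -> float:
    """Return the learning-rate multiplier in [0, 1] at a given optimizer step."""
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from 1 down to 0.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


WARMUP_STEPS = 100     # lr_scheduler_warmup_steps from the README
STEPS_PER_EPOCH = 182  # assumption: estimated from the training table
NUM_EPOCHS = 10        # num_epochs after this commit
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

for step in (0, 50, 100, 900, 1800, TOTAL_STEPS):
    multiplier = cosine_with_warmup(step, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr multiplier = {multiplier:.4f}")
```

Under these assumptions the multiplier at step 1800 is well under 1% of the peak, so the last logged rows are trained with a learning rate close to zero.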