SnakyMcSnekFace committed
Commit 00ea061 (1 parent: 19bd185)
Update README.md
README.md CHANGED
@@ -92,6 +92,8 @@ Training parameters:
 - Sample size: 768 tokens
 - Samples per epoch: 47420
 - Number of epochs: 2
+- Batch size: 1
+- Gradient accumulation steps: 16
 - First epoch: Learning rate = 3e-4, 1000 steps warmup, cosine schedule
 - Second epoch: Learning rate = 1e-4, 256 steps warmup, inverse sqrt schedule
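The two added parameters determine the effective batch size and, with the sample count, the approximate number of optimizer updates per epoch. The sketch below works through that arithmetic and illustrates one common form of each learning-rate schedule named in the README. It rests on several assumptions that the README does not state: training runs on a single device, "steps" in the warmup settings means optimizer steps, warmup is linear, the cosine schedule decays to zero over one epoch, and the inverse sqrt schedule scales as sqrt(warmup_steps / step). The function names are hypothetical and are not taken from the author's training code.

```python
import math

# Values taken from the README's training parameters.
SAMPLES_PER_EPOCH = 47420
BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16

# Assumption: a single device, so there is no data-parallel multiplier.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: a trailing partial accumulation window still triggers an update.
optimizer_steps_per_epoch = math.ceil(SAMPLES_PER_EPOCH / effective_batch_size)


def cosine_with_warmup(step, peak_lr=3e-4, warmup_steps=1000,
                       total_steps=optimizer_steps_per_epoch):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed form)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def inverse_sqrt_with_warmup(step, peak_lr=1e-4, warmup_steps=256):
    """Linear warmup to peak_lr, then decay as sqrt(warmup_steps / step) (assumed form)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * math.sqrt(warmup_steps / step)


print(effective_batch_size)       # 16
print(optimizer_steps_per_epoch)  # 2964
```

Under these assumptions, each optimizer update sees 16 samples, and the 1000-step warmup of the first epoch covers roughly a third of that epoch's 2964 updates. If the warmup figures were instead counted in micro-batches, the proportions would differ.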