SnakyMcSnekFace committed on
Commit
00ea061
1 Parent(s): 19bd185

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -92,6 +92,8 @@ Training parameters:
  - Sample size: 768 tokens
  - Samples per epoch: 47420
  - Number of epochs: 2
+ - Batch size: 1
+ - Gradient accumulation steps: 16
  - First epoch: Learning rate = 3e-4, 1000 steps warmup, cosine schedule
  - Second epoch: Learning rate = 1e-4, 256 steps warmup, inverse sqrt schedule
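
The two added parameters determine how many samples contribute to each optimizer update. The sketch below works through that arithmetic using only the values listed in the README. The function names and the `NUM_DEVICES` constant are hypothetical, and the single-device setup and the rounding up of a final partial accumulation window are assumptions, since the README does not state either.

```python
import math

# Values taken from the training parameters in the README.
SAMPLES_PER_EPOCH = 47420
SAMPLE_SIZE_TOKENS = 768
BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16

# Assumption: a single training device. The README does not say how many
# devices were used; with N data-parallel devices the effective batch
# would be N times larger.
NUM_DEVICES = 1


def effective_batch_size(batch_size, grad_accum_steps, num_devices=1):
    """Number of samples that contribute to one optimizer update."""
    return batch_size * grad_accum_steps * num_devices


def optimizer_steps_per_epoch(samples, eff_batch):
    """Optimizer updates per epoch.

    Assumption: a final, partially filled accumulation window still
    triggers an update, hence the ceiling.
    """
    return math.ceil(samples / eff_batch)


eff = effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
steps = optimizer_steps_per_epoch(SAMPLES_PER_EPOCH, eff)

# Upper bound on tokens per update, reached only if every sample is
# exactly 768 tokens long.
max_tokens_per_update = eff * SAMPLE_SIZE_TOKENS

print(eff, steps, max_tokens_per_update)  # 16 2964 12288
```

Under these assumptions each epoch has about 2964 optimizer updates. If the warmup lengths in the README count optimizer updates rather than micro-batches (the README does not specify which), the 1000-step warmup of the first epoch would cover roughly a third of that epoch.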