budecosystem/code-millenials-13b

dittops committed on Jan 2

Commit 72ed7c1 • 1 Parent(s): 10fc4ac

Update README.md

Files changed (1)

README.md +1 -1

@@ -68,7 +68,7 @@ The model is trained of 8 A100 80GB for approximately 15hrs.
 | per_device_train_batch_size  | 2      |
 | gradient_accumulation_steps  | 1      |
 | epoch | 3 |
-| steps | 19206 |
+| steps | 34503 |
 | learning_rate                | 2e-5   |
 | lr schedular type | cosine |
 | warmup ratio | 0.1 |
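
As a rough sanity check on the changed `steps` value, the sketch below relates total optimizer steps to the other hyperparameters in the table. It assumes plain data-parallel training across all 8 GPUs named in the hunk header, that `steps` counts optimizer steps summed over all 3 epochs, and that the last partial batch of each epoch is kept. None of this is confirmed by the README, and the function names are hypothetical. The sample counts it produces are implied upper bounds, not figures reported by the authors.

```python
import math


def total_optimizer_steps(num_samples, per_device_batch_size, num_devices,
                          grad_accum_steps, epochs):
    """Total optimizer steps, assuming data parallelism and a kept last batch."""
    effective_batch = per_device_batch_size * num_devices * grad_accum_steps
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * epochs


def implied_max_samples(total_steps, per_device_batch_size, num_devices,
                        grad_accum_steps, epochs):
    """Largest dataset size consistent with a given total step count."""
    effective_batch = per_device_batch_size * num_devices * grad_accum_steps
    return (total_steps // epochs) * effective_batch


# Values from the table; num_devices=8 is taken from the hunk header (assumption).
old_samples = implied_max_samples(19206, 2, 8, 1, 3)  # 102432
new_samples = implied_max_samples(34503, 2, 8, 1, 3)  # 184016
print(old_samples, new_samples)
```

Under these assumptions, the new value of 34503 steps corresponds to at most 184016 training samples, against at most 102432 for the old value of 19206.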