experiment_2_8b-fp16
Another experimental training run with unsloth. This time, roughly 0.6 epochs of the cleaned c2-logs. My hyperparameters are probably bad, since the loss value was super weird at the end. I also uploaded another version in the checkpoint-3500 branch that may mitigate some of that.
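
To try the checkpoint-3500 branch instead of the main one, something like the following might work. It is a rough sketch, not a tested recipe: it assumes the weights load as a causal LM through the `transformers` library, that fp16 is the right dtype (guessed from the name), and `REPO_ID` is a placeholder that has to be replaced with the real repository id.

```python
# Sketch only: REPO_ID is a placeholder, not the real repository name.
REPO_ID = "your-username/experiment_2_8b-fp16"  # hypothetical; replace with the actual repo id
REVISION = "checkpoint-3500"  # branch name taken from the note above


def load_checkpoint(repo_id: str = REPO_ID, revision: str = REVISION):
    """Load tokenizer and model from a specific branch of a Hub repo."""
    # Imported lazily so defining this function does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        revision=revision,          # selects the branch instead of main
        torch_dtype=torch.float16,  # assumption based on "fp16" in the name
    )
    return tokenizer, model
```

Calling `load_checkpoint()` with the real repo id downloads that branch; passing `revision="main"` gives the default upload for comparison.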