Delta-Vector
committed on
Commit • 83d850a
1 Parent(s): 44e01ec
Update README.md
README.md
CHANGED
@@ -215,8 +215,4 @@ The training was done for 2 epochs. We used 8 x [H100s](https://www.nvidia.com/
 
 ## Safety
 
-
-
-## Musings
-
-One of the members of Anthracite had quite an interesting idea: finetune a smaller model for 4 epochs at a lower learning rate, on the reasoning that "smaller models learn slower". [Kalomaze]() provided access to 8 x A40s, and we finetuned what is now [Darkens-8B]() for 4 epochs (its 2.5-epoch version was released as [Tor-8B]()). The result was quite impressive: the 4-epoch model was not "overfit" at all and was rather pleasant to use. Lucy Knada then allowed me to do a full-parameter finetune with the same configuration as Darkens/Tor-8B (with some minor dataset tweaks) on 8 x H100s. We hosted and tested the models, and I ended up giving the green light to release the 4-epoch version as Magnum 9B V4, releasing the 2-epoch version as my own. I felt both were extremely good models, but in testing I preferred the 2-epoch one: it was not as "suggestive" as Magnum models (and Claude RP-log-trained models in general) tend to be, it would not dive into Claudeisms right out of the gate, and you could use it for both safe-for-work and "other" purposes.
+Nein.
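As an aside on the recipe described in the removed Musings paragraph: a minimal sketch of the "more epochs at a lower learning rate" idea, assuming a Hugging Face `transformers`-style trainer. The output path and all hyperparameter values are illustrative assumptions, not the actual Darkens/Tor-8B configuration.

```python
# Sketch of the "smaller models learn slower" recipe from the removed
# paragraph: give the smaller model more epochs at a lower learning rate.
# All names and values here are hypothetical, not the released config.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="darkens-8b-sketch",   # hypothetical output path
    num_train_epochs=4,               # smaller model: train longer...
    learning_rate=1e-5,               # ...at a lower learning rate
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=True,                        # typical on A40/H100 hardware
)
```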