---
license: llama3
---

ChatML format. The dataset contains about 1,400 entries ranging from 8k to 16k tokens, split three ways between long-context multi-turn chat, long-context summarization, and writing analysis. Full fine-tune using a linear RoPE scale factor of 2.0, trained for five epochs with a learning rate of 1e-5.
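
For reference, entries follow the standard ChatML turn structure. The roles and content below are illustrative, not taken from the dataset:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Summarize the document below.<|im_end|>
<|im_start|>assistant
Here is a summary of the key points...<|im_end|>
```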
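
A minimal sketch of applying the same linear RoPE scaling at inference time, assuming the Hugging Face `transformers` API; the base checkpoint name is a placeholder:

```python
from transformers import AutoModelForCausalLM

# Extend the base model's context window with linear RoPE scaling,
# matching the factor 2.0 used for this fine-tune.
# "meta-llama/Meta-Llama-3-8B" is a placeholder; substitute the
# actual base model this card refers to.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    rope_scaling={"type": "linear", "factor": 2.0},
)
```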