HaileyStorm committed
Commit 9f7c22e
1 Parent(s): a6f99c7
Update README.md
README.md CHANGED

@@ -76,7 +76,7 @@ It is a prune of Meta-Llama-3-8B-Instruct from 32 layers down to 20, or about 5.
 Mostly, this is a test of (significant) pruning & healing an instruct-tuned model.
 
 ## Healing / Finetune
-I healed the model by doing a full weight DPO finetune for 139k samples (3.15 epochs), and then a LoRA with r=128 a=256 for 73k samples (1.67 epochs).
+I healed the model by doing a full weight DPO finetune for 139k samples (3.15 epochs), and then a LoRA with r=128 a=256 for 73k samples (1.67 epochs). Both had 8k sequence length.
 
 Prior to healing, the model returned absolute gibberish to any prompt, rarely two real words together. For example, given "2+2=" it might return "Mahmisan Pannpyout Na RMITa CMI TTi GP BP GP RSi TBi DD PS..."
 
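For context on the healing recipe described in the changed line, here is a minimal sketch of what the LoRA stage of such a DPO finetune could look like with the Hugging Face trl and peft libraries. Only r=128, alpha=256, and the 8k sequence length come from the README text; the model path, dataset name, target modules, DPO beta, and prompt-length split are placeholder assumptions, not the author's actual settings.

```python
# Sketch of a LoRA DPO "healing" run (r=128, alpha=256, 8k sequence length, per the README).
# Model/dataset names and most other hyperparameters are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "path/to/pruned-llama3-20L"  # placeholder for the pruned checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# LoRA adapter matching the stated rank/alpha (r=128, alpha=256).
peft_config = LoraConfig(
    r=128,
    lora_alpha=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)

# Preference dataset with "prompt"/"chosen"/"rejected" columns (placeholder name).
dataset = load_dataset("some/preference-dataset", split="train")

training_args = DPOConfig(
    output_dir="pruned-llama3-healed",
    max_length=8192,          # 8k sequence length, as stated in the README
    max_prompt_length=4096,   # assumed split between prompt and completion
    num_train_epochs=1.67,    # ~73k samples per the README
    beta=0.1,                 # assumed DPO beta
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older trl versions
    peft_config=peft_config,
)
trainer.train()
```

The earlier full-weight DPO stage would presumably use the same trainer setup without the `peft_config`, so that all model weights are updated rather than a low-rank adapter.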