kaiokendev/superhot-30b-8k-no-rlhf-test

kaiokendev committed on Jun 25, 2023

Commit 27a8de1 (1 parent: b71cff8)

Update README.md

Files changed (1)

README.md +6 -0

@@ -9,6 +9,12 @@ Tests have shown that the model does indeed leverage the extended context at 8K.
 You will need to **use either the monkeypatch** or, if you are already using the monkeypatch, **change the scaling factor to 0.25 and the maximum sequence length to 8192**
+#### Looking for Merged & Quantized Models?
+30B 4-bit CUDA: [tmpupload/superhot-30b-8k-4bit-safetensors](https://huggingface.co/tmpupload/superhot-30b-8k-4bit-safetensors)
+30B 4-bit CUDA 128g: [tmpupload/superhot-30b-8k-4bit-128g-safetensors](https://huggingface.co/tmpupload/superhot-30b-8k-4bit-128g-safetensors)
+#### Training Details
 I trained the LoRA with the following configuration:
 - 1200 samples (~400 samples over 2048 sequence length)
 - learning rate of 3e-4
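
The README's instruction to set the scaling factor to 0.25 and the maximum sequence length to 8192 can be illustrated with a small sketch. This is not the monkeypatch itself; it only shows the arithmetic under the assumption that the scaling factor multiplies each token position before rotary angles are computed. The function name `scaled_rope_angles`, the head dimension of 128, and the base of 10000 are hypothetical choices for illustration and do not come from the README.

```python
import math

# Values taken from the README text.
MAX_SEQ_LEN = 8192
SCALE = 0.25


def scaled_rope_angles(position, head_dim=128, base=10000.0, scale=SCALE):
    """Return rotary angles for one token position after scaling the index.

    head_dim and base are assumed values, not taken from the README.
    """
    scaled_pos = position * scale
    # One inverse frequency per pair of dimensions, as in standard rotary embeddings.
    inv_freq = [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]
    return [scaled_pos * f for f in inv_freq]


# The last position of an 8192-token window lands just below 2048 after scaling.
largest_scaled = (MAX_SEQ_LEN - 1) * SCALE

# Position 4 with scale 0.25 produces the same angles as position 1 unscaled.
same_angles = scaled_rope_angles(4, scale=0.25) == scaled_rope_angles(1, scale=1.0)
```

Under that assumption, every position in an 8192-token window is mapped into a range below 2048, which is the sequence length the README mentions alongside its training samples.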