Update README.md
---
license: other
---

# superhot-13b-8k-4bit-32g-safetensors

**Note: the maximum sequence length (`max_seq_len`) and the compression factor (`compress_pos_emb`) must be set to 8192 and 4, respectively.**
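The note above reflects how SuperHOT-style models reach an 8k window: RoPE position indices are divided by the compression factor before the rotary angles are computed, so positions 0..8191 are squeezed into the 0..2047 range the base model was trained on. A minimal sketch of that scaling (the `rope_angle` helper and its defaults are illustrative, not part of any library):

```python
import math

def rope_angle(position, dim_pair, head_dim=128, base=10000.0, compress_pos_emb=1.0):
    """Rotary-embedding angle for one (position, dimension-pair), with
    SuperHOT-style position compression (linear interpolation)."""
    # Dividing the position index by the compression factor squeezes an
    # 8192-token context into the base model's trained 0..2047 range.
    scaled_pos = position / compress_pos_emb
    inv_freq = base ** (-2.0 * dim_pair / head_dim)
    return scaled_pos * inv_freq

# With compress_pos_emb=4, position 8191 is embedded where plain RoPE
# would place position 2047.75 -- inside the trained range:
assert rope_angle(8191, 0, compress_pos_emb=4.0) == 8191 / 4.0
```

The factor has to match what the LoRA was fine-tuned with — here 4, i.e. 8192 / 2048 — which is why both settings are called out together in the note.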
Merged the base LLaMA model and the SuperHOT LoRA using:
https://github.com/tloen/alpaca-lora

Base LLaMA 13B:
https://huggingface.co/huggyllama/llama-13b

SuperHOT 13B 8k no-rlhf-test LoRA:
https://huggingface.co/kaiokendev/superhot-13b-8k-no-rlhf-test
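Merging folds the low-rank adapter update into the base weights, W' = W + (alpha / r) · B · A, after which no adapter is needed at load time. A pure-Python sketch of that arithmetic (`merge_lora` is illustrative, not alpaca-lora's actual export code):

```python
def merge_lora(W, A, B, lora_alpha, r):
    """Fold a LoRA update into a base weight matrix (pure-Python sketch).

    W: out x in base weight, A: r x in, B: out x r, as lists of lists.
    The merged weight is W + (lora_alpha / r) * B @ A, after which the
    adapter can be discarded.
    """
    scaling = lora_alpha / r
    merged = [row[:] for row in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            update = sum(B[i][k] * A[k][j] for k in range(r))
            merged[i][j] += scaling * update
    return merged

# Toy 2x2 example with rank r = 1:
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]          # r x in
B = [[0.5], [0.25]]       # out x r
merged = merge_lora(W, A, B, lora_alpha=2, r=1)
# scaling = 2; B @ A = [[0.5, 1.0], [0.25, 0.5]]; merged = W + 2 * (B @ A)
assert merged == [[2.0, 2.0], [0.5, 2.0]]
```

Real LLaMA-13B projection matrices are of course much larger (e.g. 5120 x 5120), but the merge is the same per-matrix operation.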
Perplexity benchmark (output truncated):

``` sh
CUDA_VISIBLE_DEVICES=0 python test_benchmark_inference.py \
...
-- Loading dataset...
-- Testing 40 chunks....
** Perplexity: 5.4066
```
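The `** Perplexity: 5.4066` figure is the exponential of the mean per-token negative log-likelihood over the test chunks. A minimal sketch of that relationship (not the benchmark script itself):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that assigned every token probability 1/5.4066 would score
# exactly the perplexity reported above:
nll = math.log(5.4066)
assert abs(perplexity([nll] * 40) - 5.4066) < 1e-9
```

Lower is better; a perplexity of 5.4 at 8k context means the quantized merge still predicts the evaluation text about as sharply as a uniform 5.4-way choice per token.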