namespace-Pt committed
Commit 0a717d4
1 Parent(s): 0851965

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +5 -4
  2. data/needle.png +2 -2
  3. data/topic.png +0 -0
README.md CHANGED
@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 <div align="center">
 <h1>Llama-3-8B-Instruct-80K-QLoRA</h1>

- <a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon/new/docs/llama3-8b-instruct-qlora-80k.md">[Data&Code]</a>
+ <a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/">[Data&Code]</a>
 </div>

 We extend the context length of Llama-3-8B-Instruct to 80K using QLoRA and 3.5K long-context training data synthesized from GPT-4. The entire training cycle is highly efficient, taking only 8 hours on an 8xA800 (80G) machine. Yet the resulting model achieves remarkable performance on a series of downstream long-context evaluation benchmarks.
@@ -27,9 +27,9 @@ We evaluate the model on [LongBench](https://arxiv.org/abs/2308.14508) using 32K

 |Model|Single-Doc QA|Multi-Doc QA|Summarization|Few-Shot Learning|Synthetic|Code|
 |:-:|:-:|:-:|:-:|:-:|:-:|:-:|
- |[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)|37.33|36.04|26.83|69.56|37.75|53.24|
+ |[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)|37.33|36.04|26.83|**69.56**|37.75|53.24|
 |[gradientai/Llama-3-8B-Instruct-262k](https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k)|37.29|31.20|26.18|67.25|44.25|**62.71**|
- |[Llama-3-8B-Instruct-80K-QLoRA]()|**43.57**|**43.07**|**28.93**|**69.15**|**48.50**|51.95|
+ |[Llama-3-8B-Instruct-80K-QLoRA]()|**43.57**|**43.07**|**28.93**|69.15|**48.50**|51.95|

 ## InfiniteBench
 We evaluate the model on [InfiniteBench](https://arxiv.org/pdf/2402.13718.pdf) using 80K context length and the official prompt template. The results of GPT-4 are copied from the [paper](https://arxiv.org/pdf/2402.13718.pdf). For [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), we use 8K context length.
@@ -88,7 +88,6 @@ base_model = AutoModelForCausalLM.from_pretrained(

 # NOTE: expand rope base
 rope_theta=200e6,
- max_position_embeddings=81920,
 )

 model = PeftModel.from_pretrained(
@@ -119,3 +118,5 @@ with torch.no_grad():
 print(f"Answers: {example['answer']}")
 print(f"Prediction: {tokenizer.decode(outputs[0])}")
 ```
+ You may observe messages like:
+ `This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (8192). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.` or `Setting pad_token_id to eos_token_id:128001 for open-end generation`. These messages are harmless and can be safely ignored.
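For orientation, the third hunk keeps only the RoPE base expansion (`rope_theta=200e6`) and stops overriding `max_position_embeddings` when loading the base model; the QLoRA adapter is then applied with `peft`. Below is a minimal loading sketch consistent with the README snippet this commit edits; the adapter repository id, dtype, and device settings are illustrative assumptions, not taken from the commit itself.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
# Assumed adapter repo id for illustration; use the id given on the model card.
lora_id = "namespace-Pt/Llama-3-8B-Instruct-80K-QLoRA"

tokenizer = AutoTokenizer.from_pretrained(base_id)

# Load the base model with the expanded RoPE base, as in the README after this
# commit; max_position_embeddings is left at its config default (8192).
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # illustrative; the README may use another dtype
    device_map="auto",
    rope_theta=200e6,
)

# Attach the 80K long-context QLoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base_model, lora_id)
model.eval()
```

If a standalone model is preferred, the adapter can typically be folded into the base weights with `model = model.merge_and_unload()` before saving.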
 
 
 
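Regarding the note added in the last hunk: since this commit stops overriding `max_position_embeddings`, the config keeps the default 8192, which is presumably what triggers the "friendly reminder" about exceeding the predefined maximum length even though the expanded RoPE base is meant to handle 80K inputs. The `pad_token_id` message can be avoided by passing it explicitly to `generate`. A short sketch, continuing from the loading example above (the prompt and generation settings are illustrative):

```python
prompt = "Summarize the following document:\n..."  # any long input

inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=False,
        # Passing pad_token_id explicitly silences the
        # "Setting pad_token_id to eos_token_id ..." message.
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```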
data/needle.png CHANGED

Git LFS Details

  • SHA256: 3ef5f7561f20bcea38effa1b22488121c8b43f56cdc1f9cce379f271a747fcaa
  • Pointer size: 132 Bytes
  • Size of remote file: 1.47 MB

Git LFS Details

  • SHA256: 259f2e322baf6af1d6121e9f46e6b3c8d6ffaf378b5f838ecf2bbb4df87e78b7
  • Pointer size: 130 Bytes
  • Size of remote file: 70.2 kB
data/topic.png ADDED