Qwen/Qwen2-7B-Instruct-GPTQ-Int4

JustinLin610 committed on Jun 9

Commit 4474f87 • 1 Parent(s): 4efcaf2

Update README.md


README.md +1 -1


@@ -19,7 +19,7 @@ Qwen2-7B-Instruct-GPTQ-Int4 supports a context length of up to 131,072 tokens, e
 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/), [GitHub](https://github.com/QwenLM/Qwen2), and [Documentation](https://qwen.readthedocs.io/en/latest/).
-**Note**: If you encounter ``RuntimeError: probability tensor contains either `inf`, `nan` or element < 0`` during inference with ``transformer``, we recommand installing ``autogpq>=0.7.1`` or [deploying this model with vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html).
+**Note**: If you encounter ``RuntimeError: probability tensor contains either `inf`, `nan` or element < 0`` during inference with ``transformers``, we recommand installing ``autogpq>=0.7.1`` or [deploying this model with vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html).
 <br>
 ## Model Details