Open-Orca/OpenOrcaxOpenChat-Preview2-13B


bleysg committed on Aug 3, 2023

Commit 8876aa6 • 1 Parent(s): 843efc8

Update README.md

Files changed (1)

README.md +7 -0

@@ -147,6 +147,13 @@ This model is most easily served with [OpenChat's](https://github.com/imoneoi/op
 This is highly recommended as it is by far the fastest in terms of inference speed and is a quick and easy option for setup.
 We also illustrate setup of Oobabooga/text-generation-webui below. The settings outlined there will also apply to other uses of `Transformers`.
+## Serving Quantized
+Pre-quantized models are now available courtesy of our friend TheBloke:
+* **GGML**: https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GGML
+* **GPTQ**: https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GPTQ
 ## Serving with OpenChat
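
The pre-quantized repositories added in this commit can be fetched like any other Hugging Face repository. The following is a minimal sketch, not part of the README: it assumes the `huggingface_hub` package is installed, and the helper names `repo_id_from_url` and `download_quantized` as well as the `local_dir` argument value are hypothetical choices made for illustration. Only the two URLs come from the diff above.

```python
from urllib.parse import urlparse

# URLs copied from the "Serving Quantized" section added in this commit.
QUANTIZED_URLS = {
    "GGML": "https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GGML",
    "GPTQ": "https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GPTQ",
}


def repo_id_from_url(url: str) -> str:
    """Turn a huggingface.co model URL into an 'owner/name' repo id."""
    return urlparse(url).path.strip("/")


def download_quantized(fmt: str, local_dir: str) -> str:
    """Download every file of the chosen quantized repo (hypothetical helper).

    Requires network access and the huggingface_hub package, which is
    imported lazily so the rest of this sketch runs without it.
    """
    from huggingface_hub import snapshot_download

    repo_id = repo_id_from_url(QUANTIZED_URLS[fmt])
    # snapshot_download returns the local folder containing the files.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_quantized("GPTQ", "./models/gptq")` would pull the full GPTQ repository into that folder. Quantized repositories often hold several variants, so downloading a single file with `hf_hub_download` may be preferable once the desired filename is known.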