Text Generation
Transformers
PyTorch
English
llama
text-generation-inference
Inference Endpoints
File size: 400 Bytes
3c999f9
 
 
276f3c7
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
---
license: llama2
---

Usage: 

1. [Install OpenChat](https://github.com/imoneoi/openchat/#installation)

2. `python -m ochat.serving.openai_api_server --model-type openchat_llama2 --model Open-Orca/Llama2_GPT4_1M --engine-use-ray --worker-use-ray --max-num-batched-tokens 5120`

To use features such as tensor parallelism on consumer GPUs, API keys and logging, follow the OpenChat documentation.