---
tags:
- sft
license: other
language:
- en
pipeline_tag: text-generation
---
This is [OpenAssistant's llama2-13b-orca-8k-3319](https://huggingface.co/OpenAssistant/llama2-13b-orca-8k-3319) in a couple of GGML formats.
I had to apply this [workaround](https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor/discussions/2) to pad the vocab before I could quantize the models; this may or may not affect performance.<br>
I have no idea what I'm doing, so if something doesn't work as it should, or at all, that's likely on me and not the models themselves.
Below is the suggested prompt format from the original repo:
For the initial response, use the format below (for the system message, e.g. the [llama2 default system prompt](https://github.com/facebookresearch/llama/blob/6c7fe276574e78057f917549435a2554000a876d/llama/generation.py#L46) works well):
```
<|system|>system message</s><|prompter|>user prompt</s><|assistant|>
```
For multi-turn conversations use:
```
<|system|>system message</s><|prompter|>Q1</s><|assistant|>A1</s><|prompter|>Q2</s><|assistant|>
```
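
If you'd rather not assemble these strings by hand, here is a minimal Python sketch that builds both the single-turn and multi-turn prompts from the format above. The names `build_prompt` and `EOS` are made up for illustration and come from neither the original repo nor any library; the sketch only does string concatenation and assumes your inference tool accepts the special tokens as plain text.

```python
from typing import List, Optional, Tuple

# End-of-sequence marker used between turns in the suggested format.
EOS = "</s>"


def build_prompt(
    system_message: str,
    user_prompt: str,
    history: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Assemble a prompt string in the suggested format.

    `history` is a list of earlier (question, answer) pairs, oldest first.
    This helper is hypothetical and not part of the original repo.
    """
    parts = [f"<|system|>{system_message}{EOS}"]
    # Replay completed turns: each one is a prompter/assistant pair.
    for question, answer in history or []:
        parts.append(f"<|prompter|>{question}{EOS}<|assistant|>{answer}{EOS}")
    # Leave the final assistant tag open so the model writes the reply.
    parts.append(f"<|prompter|>{user_prompt}{EOS}<|assistant|>")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("system message", "user prompt"))
    print(build_prompt("system message", "Q2", history=[("Q1", "A1")]))
```

With no `history` this produces the initial-response form; passing earlier question/answer pairs produces the multi-turn form.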