---
tags:
- sft
license: other
language:
- en
pipeline_tag: text-generation
---
This is [OpenAssistant's llama2-13b-orca-8k-3319](https://huggingface.co/OpenAssistant/llama2-13b-orca-8k-3319), converted to a couple of GGML formats.

I had to apply this [workaround](https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor/discussions/2) to pad the vocab before quantizing the models; this may or may not affect performance.<br>
I have no idea what I'm doing, so if something doesn't work as it should (or at all), that's likely on me, not the models themselves.
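
If you want to try one of these files with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a minimal sketch might look like the following. The filename, context size, and sampling settings are my assumptions, not something from the original repo, and newer llama-cpp-python releases only load GGUF, so a version that still reads GGML is assumed; the prompt follows the format described below.

```python
# Minimal sketch: load one of the GGML files with llama-cpp-python.
# Assumes an older llama-cpp-python release that can still read GGML
# (newer releases only accept GGUF). The filename is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="llama2-13b-orca-8k-3319.ggmlv3.q4_0.bin",  # hypothetical filename
    n_ctx=8192,  # the base model was fine-tuned for an 8k context
)

prompt = "<|system|>You are a helpful assistant.</s><|prompter|>Hello!</s><|assistant|>"
output = llm(prompt, max_tokens=256, stop=["</s>"])
print(output["choices"][0]["text"])
```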

Below is the suggested prompt format from the original repo:

For the initial response, use the following format (e.g. the [llama2 default system prompt](https://github.com/facebookresearch/llama/blob/6c7fe276574e78057f917549435a2554000a876d/llama/generation.py#L46) works well as the system message):

```
<|system|>system message</s><|prompter|>user prompt</s><|assistant|>
```
For multi-turn conversations, use:
```
<|system|>system message</s><|prompter|>Q1</s><|assistant|>A1</s><|prompter|>Q2</s><|assistant|>
```
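
Purely as an illustration, a small helper like this (the function name and message structure are my own, not from the original repo) can assemble the multi-turn template from a message history:

```python
# Sketch of a helper that builds the multi-turn prompt shown above.
# build_prompt and the (user, assistant) pair format are hypothetical.
def build_prompt(system_message, turns):
    """turns is a list of (user, assistant) pairs; pass None as the
    last assistant reply to request a new completion."""
    prompt = f"<|system|>{system_message}</s>"
    for user, assistant in turns:
        prompt += f"<|prompter|>{user}</s><|assistant|>"
        if assistant is not None:
            prompt += f"{assistant}</s>"
    return prompt

# build_prompt("system message", [("Q1", "A1"), ("Q2", None)]) yields:
# <|system|>system message</s><|prompter|>Q1</s><|assistant|>A1</s><|prompter|>Q2</s><|assistant|>
```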