Update README.md
README.md (CHANGED)
@@ -1,6 +1,15 @@
 ---
 inference: false
+language:
+- en
+tags:
+- llama
 license: other
+metrics:
+- MMLU
+- ARC
+- HellaSwag
+- TruthfulQA
 ---
 
 <!-- header start -->
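The new front matter adds `language`, `tags`, and `metrics` fields, which the Hub indexes for search and filtering. As a quick local sanity check, here is a minimal sketch (not part of the commit) that parses the updated front matter with `huggingface_hub`'s `ModelCard`; the card text is abbreviated to just the fields touched here.

```python
from huggingface_hub import ModelCard

# Minimal sketch: parse the updated front matter locally and confirm the
# fields the Hub will index. The text mirrors the diff above; the rest of
# the README body is omitted for brevity.
card_text = """\
---
inference: false
language:
- en
tags:
- llama
license: other
metrics:
- MMLU
- ARC
- HellaSwag
- TruthfulQA
---

(README body omitted)
"""

card = ModelCard(card_text)
print(card.data.language)  # ['en']
print(card.data.tags)      # ['llama']
print(card.data.metrics)   # ['MMLU', 'ARC', 'HellaSwag', 'TruthfulQA']
```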
@@ -29,6 +38,16 @@ It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/GPlatty-30B-GGML)
 * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/lilloukas/GPlatty-30B)
 
+## Prompt template
+
+```
+Below is an instruction that describes a task. Write a response that appropriately completes the request
+
+### Instruction: prompt
+
+### Response:
+```
+
 ## How to easily download and use this model in text-generation-webui
 
 Please make sure you're using the latest version of text-generation-webui
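The added "Prompt template" section documents the Alpaca-style instruction format this model expects. As an illustration only, a prompt could be assembled like this; `build_prompt` and the sample instruction are placeholders, not part of the model card:

```python
# Illustrative only: build_prompt and the example instruction are not part
# of the model card; they just fill in the template documented above.
def build_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request\n\n"
        f"### Instruction: {instruction}\n\n"
        "### Response:"
    )

print(build_prompt("Explain GPTQ quantisation in one paragraph."))
```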