---
datasets:
- Oniichat/bluemoon_roleplay_chat_data_300k_messages
inference: false
language:
- en
license: llama2
model_creator: PygmalionAI
model_link: https://huggingface.co/PygmalionAI/mythalion-13b
model_name: mythalion-13b
model_type: llama
pipeline_tag: text-generation
quantized_by: Eigeen
tags:
- text generation
- instruct
thumbnail: null
---

# Mythalion 13B - ExLlamaV2

Original model: [mythalion-13b](https://huggingface.co/PygmalionAI/mythalion-13b)

# Description

This is my first attempt at quantization. I used only an RP (roleplay) dataset for calibration, which may cause the model to perform less well in other situations. But people who use Mythalion mostly use it for RP anyway, I guess?
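
For reference, here is a rough sketch of how such a calibration file could be prepared. It assumes the Hugging Face `datasets` library and exllamav2's `convert.py`, which at the time of writing accepts a parquet calibration file via its `-c` flag; the paths and the split name are placeholders, not the exact commands I ran, so check the exllamav2 repo for the current flags.

```python
# Sketch: export the RP dataset to a parquet file that exllamav2's
# convert.py can use for calibration. Assumes the dataset has a
# "train" split; paths are placeholders.
from datasets import load_dataset

ds = load_dataset("Oniichat/bluemoon_roleplay_chat_data_300k_messages", split="train")
ds.to_parquet("rp_calibration.parquet")

# Then, roughly (flags may differ between exllamav2 versions):
#   python convert.py -i <fp16 model dir> -o <work dir> \
#       -cf <output dir> -b 2.30 -c rp_calibration.parquet
```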

Anyway, it works well for RP. I haven't tested its performance in other situations. ExLlamaV2 is great.

The 2.30 bpw quant is designed for 8GB of VRAM. It is more extreme and supports at most 2048 tokens of context. If other programs or your system occupy part of your VRAM, lower the allowed context.
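
To stay within that budget, cap `max_seq_len` when loading. Below is a minimal sketch using exllamav2's Python API; the names follow its example scripts and may vary between versions, and the model path is a placeholder. See the original model card for recommended prompt formats.

```python
# Sketch: load the 2.30 bpw quant with a 2048-token context cap.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/mythalion-13b-2.30bpw"  # placeholder local path
config.prepare()
config.max_seq_len = 2048  # lower this if other programs occupy VRAM

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

# Placeholder prompt; format it per the original model card.
print(generator.generate_simple("Hello!", settings, 200))
```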

I wouldn't use it myself, though, because its performance is poor compared to the 4 and 6 bpw quants. It merely works. I'm sharing it in case someone needs it.