Commit 05288b8 by huseinzol05 (1 parent: 86b3533)
Create README.md

README.md ADDED
@@ -0,0 +1,67 @@
---
language:
- ms
---

# 4-bit AWQ QLoRA Malaysian Llama2 13B 32k chat completions

The original model is at https://huggingface.co/mesolitica/malaysian-llama2-13b-32k-instructions and the AWQ integration is documented at https://huggingface.co/docs/transformers/main_classes/quantization#awq-integration

## how-to

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

def parse_llama_chat(messages):
    # Expects messages[0] to be the system message and messages[-1] the latest user query.
    system = messages[0]['content']
    user_query = messages[-1]['content']

    # Everything in between is earlier user/assistant history.
    users, assistants = [], []
    for q in messages[1:-1]:
        if q['role'] == 'user':
            users.append(q['content'])
        elif q['role'] == 'assistant':
            assistants.append(q['content'])

    # Build the Llama 2 chat prompt: system block, past turns, then the new query.
    texts = [f'<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n']
    for u, a in zip(users, assistants):
        texts.append(f'{u.strip()} [/INST] {a.strip()} </s><s>[INST] ')
    texts.append(f'{user_query.strip()} [/INST]')
    prompt = ''.join(texts).strip()
    return prompt

# Tokenizer and model are both loaded from the AWQ repository.
tokenizer = AutoTokenizer.from_pretrained('mesolitica/malaysian-llama2-13b-32k-instructions-AWQ')
model = AutoModelForCausalLM.from_pretrained(
    'mesolitica/malaysian-llama2-13b-32k-instructions-AWQ',
    use_flash_attention_2 = True,
)
_ = model.cuda()

messages = [
    # "you are an AI that can answer any question"
    {'role': 'system', 'content': 'awak adalah AI yang mampu jawab segala soalan'},
    # "what is KWSP"
    {'role': 'user', 'content': 'kwsp tu apa'}
]
prompt = parse_llama_chat(messages)
# The prompt already contains <s>, so do not let the tokenizer add special tokens again.
inputs = tokenizer([prompt], return_tensors='pt', add_special_tokens=False).to('cuda')
generate_kwargs = dict(
    inputs,
    max_new_tokens=1024,
    top_p=0.95,
    top_k=50,
    temperature=0.9,
    do_sample=True,
    num_beams=1,
)
r = model.generate(**generate_kwargs)
print(tokenizer.decode(r[0]))
```

```text
'<s> [INST] <<SYS>>
awak adalah AI yang mampu jawab segala soalan
<</SYS>>

kwsp tu apa [/INST] Kumpulan Wang Simpanan Pekerja (KWSP) ialah sebuah badan berkanun yang ditubuhkan di Malaysia yang menguruskan tabung simpanan tetap pekerja bagi tujuan persaraan dan perancangan masa depan. Diasaskan pada tahun 1951, KWSP bertanggungjawab untuk mengumpul dan menguruskan sumber daripada majikan dan pekerja, dan juga menyediakan pelbagai faedah kepada ahli seperti dividen dan akses kepada pengeluaran simpanan pada usia persaraan. KWSP juga memainkan peranan penting dalam menyediakan perlindungan sosial dan pembangunan ekonomi di Malaysia, dengan mempromosikan simpanan pengguna dan meningkatkan kadar celik kewangan dalam kalangan rakyat. </s>'
```
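
The how-to only shows a single-turn conversation. As a rough sketch of how the same `parse_llama_chat` function lays out a multi-turn prompt, the block below repeats the function so it runs on its own (no model or GPU needed) and feeds it a conversation with one earlier exchange. The assistant reply and the follow-up question are made-up illustration text, not model output.

```python
def parse_llama_chat(messages):
    # Same logic as the how-to: the first message is the system prompt,
    # the last message is the newest user query.
    system = messages[0]['content']
    user_query = messages[-1]['content']

    users, assistants = [], []
    for q in messages[1:-1]:
        if q['role'] == 'user':
            users.append(q['content'])
        elif q['role'] == 'assistant':
            assistants.append(q['content'])

    texts = [f'<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n']
    for u, a in zip(users, assistants):
        texts.append(f'{u.strip()} [/INST] {a.strip()} </s><s>[INST] ')
    texts.append(f'{user_query.strip()} [/INST]')
    return ''.join(texts).strip()

# Hypothetical conversation history, for illustration only.
messages = [
    {'role': 'system', 'content': 'awak adalah AI yang mampu jawab segala soalan'},
    {'role': 'user', 'content': 'kwsp tu apa'},
    {'role': 'assistant', 'content': 'Kumpulan Wang Simpanan Pekerja (KWSP) ialah sebuah badan berkanun.'},
    {'role': 'user', 'content': 'bila ia ditubuhkan'},  # "when was it established"
]

prompt = parse_llama_chat(messages)
print(prompt)
```

Each completed exchange is closed with `</s>` and the next turn reopens with `<s>[INST] `, so the prompt ends on an open `[/INST]` for the model to continue from. Note that the function assumes the list starts with a system message and ends with a user message; other layouts would need their own handling.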
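
The decoded output in the how-to contains the whole prompt plus the special tokens. If only the newest answer is wanted, one possible approach is plain string handling on the decoded text. The sketch below assumes the text follows the `[INST] ... [/INST] reply </s>` layout shown in the sample output; `extract_reply` is a hypothetical helper name, not part of `transformers`, and the sample string is shortened for illustration.

```python
def extract_reply(decoded: str) -> str:
    """Return the text after the last [/INST] marker, without the closing </s>."""
    # Everything after the final [/INST] belongs to the newest assistant turn.
    reply = decoded.rsplit('[/INST]', 1)[-1]
    # Remove the end-of-sequence token if generation finished normally.
    reply = reply.replace('</s>', '')
    return reply.strip()

# Shortened stand-in for tokenizer.decode(r[0]); not real model output.
decoded = (
    '<s> [INST] <<SYS>>\n'
    'awak adalah AI yang mampu jawab segala soalan\n'
    '<</SYS>>\n\n'
    'kwsp tu apa [/INST] Kumpulan Wang Simpanan Pekerja (KWSP) ialah sebuah badan berkanun. </s>'
)

reply = extract_reply(decoded)
print(reply)
```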