m-polignano-uniba committed
Commit
3d480f6
1 Parent(s): cdffcf2

Update README.md

Files changed (1)
  1. README.md +32 -24
README.md CHANGED
@@ -24,12 +24,6 @@ license: llama3
  <hr>
  <!--<img src="https://i.ibb.co/6mHSRm3/llamantino53.jpg" width="200"/>-->

- ## Model Details
- *Last Update: 29/04/2024*<br>
- *GitHub Link* → [https://github.com/marcopoli/LLaMAntino-3-ANITA](https://github.com/marcopoli/LLaMAntino-3-ANITA)<br>
-
- <hr>
-
  **LLaMAntino-3-ANITA-8B-sft-DPO** is a model of the [**LLaMAntino**](https://huggingface.co/swap-uniba) - *Large Language Models family*.
  The model is an instruction-tuned version of [**Meta-Llama-3-8b-instruct**](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) (a fine-tuned **LLaMA 3 model**).
  This model version aims to be the **Multilingual Base-Model** 🏁 to further fine-tune in the Italian environment.
@@ -40,14 +34,22 @@ wants to provide Italian NLP researchers with an improved model the for Italian

  <hr>

+ ## Model Details
+ *Last Update: 10/05/2024*<br>
+
+ <img src="https://static.vecteezy.com/system/resources/previews/016/833/880/large_2x/github-logo-git-hub-icon-with-text-on-white-background-free-vector.jpg" width="200"> [https://github.com/marcopoli/LLaMAntino-3-ANITA](https://github.com/marcopoli/LLaMAntino-3-ANITA)<br>
+
+ <hr>
+
  ## Specifications

- - **Model developers**: Ph.D. Marco Polignano - University of Bari Aldo Moro, Italy
- - **Variations**: The model release has been **supervised fine-tuning (SFT)** using **QLoRA**, on a long list of instruction-based datasets. **DPO** approach over the *HuggingFaceH4/ultrafeedback_binarized* dataset is used to align with human preferences for helpfulness and safety.
+ - **Model developers**: Ph.D. Marco Polignano - University of Bari Aldo Moro, Italy - SWAP Research Group
+ - **Variations**: The model has been **supervised fine-tuned (SFT)** with **QLoRA** (4-bit) on two instruction-based datasets. A **DPO** step over the *jondurbin/truthy-dpo-v0.1* dataset is then used to align the model with human preferences for helpfulness and safety.
  - **Input**: Models input text only.
  - **Output**: Models generate text and code only.
  - **Model Architecture**: *Llama 3 architecture*.
  - **Context length**: 8K, 8192.
+ - **Library Used**: [Unsloth](https://unsloth.ai/)
  <hr>

  ## Playground
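The updated **Variations** bullet describes the training recipe only at a high level: QLoRA 4-bit SFT followed by DPO over *jondurbin/truthy-dpo-v0.1*, implemented with Unsloth. As a rough illustration of that recipe, here is a minimal sketch of the DPO step using `trl` and `peft` rather than the authors' actual Unsloth scripts; the hyperparameters, output path, and starting checkpoint are assumptions, and keyword names vary slightly across `trl` versions.

```python
# Hedged sketch only: not the authors' training code. It illustrates the recipe named in
# the Variations bullet (LoRA adapters on a 4-bit base, then DPO on truthy-dpo-v0.1).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

sft_checkpoint = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed starting point (after SFT)

bnb_config = BitsAndBytesConfig(          # QLoRA: freeze 4-bit NF4 base weights
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    sft_checkpoint, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(sft_checkpoint)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token

peft_config = LoraConfig(                 # train LoRA adapters on top of the frozen base
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

dpo_dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")  # prompt/chosen/rejected pairs

dpo_args = DPOConfig(                     # illustrative hyperparameters, not the authors' values
    output_dir="anita-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    num_train_epochs=1,
    beta=0.1,
    max_length=1024,
    max_prompt_length=512,
)
trainer = DPOTrainer(                     # exact keyword names differ across trl versions
    model=model,
    ref_model=None,                       # with a peft_config, the base weights act as the reference
    args=dpo_args,
    train_dataset=dpo_dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```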
@@ -74,7 +76,7 @@ For direct use with `transformers`, you can easily get started with the followin
  AutoTokenizer,
  )

- base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-sft-DPO"
+ base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-Instr-DPO-ITA"
  model = AutoModelForCausalLM.from_pretrained(
  base_model,
  torch_dtype=torch.bfloat16,
@@ -83,8 +85,10 @@ For direct use with `transformers`, you can easily get started with the followin
  tokenizer = AutoTokenizer.from_pretrained(base_model)

  messages = [
- {"role": "system", "content": "Answer clearly and detailed."},
- {"role": "user", "content": "Why is the sky blue ?"}
+ {"role": "system", "content": "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA \
+ (Advanced Natural-based interaction for the ITAlian language). \
+ Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."},
+ {"role": "user", "content": "Why is the sky blue?"}
  ]

  #Method 1
@@ -92,7 +96,7 @@ For direct use with `transformers`, you can easily get started with the followin
  inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
  for k,v in inputs.items():
  inputs[k] = v.cuda()
- outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.85, temperature=0.7)
+ outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
  results = tokenizer.batch_decode(outputs)[0]
  print(results)

@@ -104,9 +108,9 @@ For direct use with `transformers`, you can easily get started with the followin
  return_full_text=False, # langchain expects the full text
  task='text-generation',
  max_new_tokens=512, # max number of tokens to generate in the output
- temperature=0.7, #temperature for more or less creative answers
+ temperature=0.6, #temperature for more or less creative answers
  do_sample=True,
- top_p=0.85,
+ top_p=0.9,
  )

  sequences = pipe(messages)
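The pipeline hunk above only exposes the changed keyword arguments. For readers skimming the diff, a plausible reconstruction of the full Method 2 call after this commit is sketched below; the surrounding lines are assumed from the unchanged parts of the README, not read from the hunks.

```python
import transformers

# Assumed reconstruction of the surrounding, unchanged pipeline setup (Method 2).
pipe = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,   # langchain expects the full text
    task="text-generation",
    max_new_tokens=512,       # max number of tokens to generate in the output
    temperature=0.6,          # temperature for more or less creative answers
    do_sample=True,
    top_p=0.9,
)

sequences = pipe(messages)    # messages is the chat list defined earlier in the README
for seq in sequences:
    print(seq["generated_text"])
```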
@@ -125,7 +129,7 @@ For direct use with `transformers`, you can easily get started with the followin
  BitsAndBytesConfig,
  )

- base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-sft-DPO"
+ base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-Instr-DPO-ITA"
  bnb_config = BitsAndBytesConfig(
  load_in_4bit=True,
  bnb_4bit_quant_type="nf4",
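The 4-bit hunk above shows only the first two `BitsAndBytesConfig` fields. A typical complete quantized load for the renamed checkpoint looks like the sketch below; the compute dtype, the double-quantization flag, and `device_map` are common defaults assumed here, not values read from the file.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-Instr-DPO-ITA"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
    bnb_4bit_use_double_quant=False,        # assumed; set True to save a little more memory
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
```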
@@ -140,8 +144,10 @@ For direct use with `transformers`, you can easily get started with the followin
  tokenizer = AutoTokenizer.from_pretrained(base_model)

  messages = [
- {"role": "system", "content": "Answer clearly and detailed."},
- {"role": "user", "content": "Why is the sky blue ?"}
+ {"role": "system", "content": "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA \
+ (Advanced Natural-based interaction for the ITAlian language). \
+ Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."},
+ {"role": "user", "content": "Why is the sky blue?"}
  ]

  #Method 1
@@ -149,7 +155,7 @@ For direct use with `transformers`, you can easily get started with the followin
  inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
  for k,v in inputs.items():
  inputs[k] = v.cuda()
- outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.85, temperature=0.7)
+ outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
  results = tokenizer.batch_decode(outputs)[0]
  print(results)

@@ -161,9 +167,9 @@ For direct use with `transformers`, you can easily get started with the followin
  return_full_text=False, # langchain expects the full text
  task='text-generation',
  max_new_tokens=512, # max number of tokens to generate in the output
- temperature=0.7, #temperature for more or less creative answers
+ temperature=0.6, #temperature for more or less creative answers
  do_sample=True,
- top_p=0.85,
+ top_p=0.9,
  )

  sequences = pipe(messages)
@@ -187,7 +193,7 @@ For direct use with `unsloth`, you can easily get started with the following ste
  from unsloth import FastLanguageModel
  import torch

- base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-sft-DPO"
+ base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-Instr-DPO-ITA"
  model, tokenizer = FastLanguageModel.from_pretrained(
  model_name = base_model,
  max_seq_length = 8192,
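The `unsloth` hunk shows only the model name and sequence length being passed to `FastLanguageModel.from_pretrained`. A minimal complete load in the same spirit is sketched below; the `dtype` and `load_in_4bit` choices are assumptions rather than values taken from the README.

```python
from unsloth import FastLanguageModel

base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-Instr-DPO-ITA"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model,
    max_seq_length=8192,
    dtype=None,          # None lets unsloth pick bfloat16 or float16 for the GPU (assumed)
    load_in_4bit=True,   # assumed: 4-bit loading to fit consumer GPUs
)
FastLanguageModel.for_inference(model)  # enable unsloth's faster inference path
```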
@@ -200,14 +206,16 @@ For direct use with `unsloth`, you can easily get started with the following ste
  - Right now, you can start using the model directly.
  ```python
  messages = [
- {"role": "system", "content": "Answer clearly and detailed."},
- {"role": "user", "content": "Why is the sky blue ?"}
+ {"role": "system", "content": "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA \
+ (Advanced Natural-based interaction for the ITAlian language). \
+ Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."},
+ {"role": "user", "content": "Why is the sky blue?"}
  ]
  prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
  inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
  for k,v in inputs.items():
  inputs[k] = v.cuda()
- outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.85, temperature=0.7)
+ outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
  results = tokenizer.batch_decode(outputs)[0]
  print(results)
  ```
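One detail the updated `generate` calls leave implicit: Llama 3 instruct checkpoints end a turn with the `<|eot_id|>` token in addition to the regular EOS token. If generations run past the end of the answer, passing both as stop tokens usually helps; this is a hedged suggestion, not part of the commit.

```python
# Optional: stop generation at Llama 3's end-of-turn token as well as the default EOS.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    top_p=0.9,
    temperature=0.6,
)
```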
 