swap-uniba/LLaMAntino-2-chat-7b-hf-UltraChat-ITA

@@ -5,7 +5,13 @@ language:
 tags:
 - text-generation-inference
 ---
 # Model Card for LLaMAntino-2-chat-7b-UltraChat-ITA
 ## Model description
@@ -33,7 +39,7 @@ If you are interested in more details regarding the training procedure, you can
 The following prompt format, based on the [LLaMA 2 prompt template](https://gpus.llm-utils.org/llama-2-prompt-template/) and adapted to the Italian language, was used:
 ```python
-"<s>[INST] <<SYS>>\n" \
 "Sei un assistente disponibile, rispettoso e onesto. " \
 "Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. " \
 "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
@@ -41,7 +47,7 @@ This prompt format based on the [LLaMA 2 prompt template](https://gpus.llm-utils
 "Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
 "Se non conosci la risposta a una domanda, non condividere informazioni false.\n" \
 "<</SYS>>\n\n" \
-f"{user_msg_1} [/INST] {model_answer_1} </s><s>[INST] {user_msg_2} [/INST] {model_answer_2} </s> ... <s>[INST] {user_msg_N} [/INST] {model_answer_N} </s> "
 ```
 We recommend using the same prompt in inference to obtain the best results!
@@ -60,7 +66,7 @@ model = AutoModelForCausalLM.from_pretrained(model_id)
 user_msg = "Ciao! Come stai?"
-prompt = "<s>[INST] <<SYS>>\n" \
          "Sei un assistente disponibile, rispettoso e onesto. " \
          "Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. " \
          "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
@@ -68,22 +74,38 @@ prompt = "<s>[INST] <<SYS>>\n" \
          "Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
          "Se non conosci la risposta a una domanda, non condividere informazioni false.\n" \
          "<</SYS>>\n\n" \
-         f"{user_msg} [/INST] "
 input_ids = tokenizer(prompt, return_tensors="pt").input_ids
-outputs = model.generate(input_ids=input_ids, max_length=1024)
 print(tokenizer.batch_decode(outputs.detach().cpu().numpy()[:, input_ids.shape[1]:], skip_special_tokens=True)[0])
 ```
-If you are facing issues when loading the model, you can try to load it quantized:
 ```python
 model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
 ```
-*Note*: The model loading strategy above requires the [*bitsandbytes*](https://pypi.org/project/bitsandbytes/) and [*accelerate*](https://pypi.org/project/accelerate/) libraries
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
@@ -105,4 +127,6 @@ If you use this model in your research, please cite the following:
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
-```

 tags:
 - text-generation-inference
 ---
+<img src="https://i.ibb.co/6mHSRm3/llamantino53.jpg" alt="llamantino53" border="0" width="200px">
 # Model Card for LLaMAntino-2-chat-7b-UltraChat-ITA
+*Last Update: 08/01/2024*<br>*Example of Use*: [Colab Notebook](https://colab.research.google.com/drive/1lCQ7MqSNKILsIncNYhdN_yqzSvl4akat?usp=sharing)
+<hr>
 ## Model description
 The following prompt format, based on the [LLaMA 2 prompt template](https://gpus.llm-utils.org/llama-2-prompt-template/) and adapted to the Italian language, was used:
 ```python
+" [INST]<<SYS>>\n" \
 "Sei un assistente disponibile, rispettoso e onesto. " \
 "Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. " \
 "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
 "Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
 "Se non conosci la risposta a una domanda, non condividere informazioni false.\n" \
 "<</SYS>>\n\n" \
+f"{user_msg_1}[/INST] {model_answer_1} </s> <s> [INST]{user_msg_2}[/INST] {model_answer_2} </s> ... <s> [INST]{user_msg_N}[/INST] {model_answer_N} </s>"
 ```
 We recommend using the same prompt in inference to obtain the best results!
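As an illustration of the template above, the following minimal sketch assembles a single-turn or multi-turn prompt string. The names `build_prompt`, `system_prompt` and `turns` are hypothetical helpers, not part of the model card, and the system prompt in the usage lines is deliberately abbreviated to its first sentence; for real inference, pass the full system prompt used in training.

```python
from typing import List, Optional, Tuple


def build_prompt(system_prompt: str, turns: List[Tuple[str, Optional[str]]]) -> str:
    """Assemble a prompt following the updated template.

    `turns` is a list of (user_msg, model_answer) pairs. Use None as the
    answer of the last pair to leave the prompt open for generation.
    """
    if not turns:
        raise ValueError("at least one user message is required")
    # The system block appears once, inside the first [INST] segment.
    prompt = f" [INST]<<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    for i, (user_msg, model_answer) in enumerate(turns):
        if i > 0:
            # Later turns are introduced by an explicit <s> and a new [INST].
            prompt += " <s> [INST]"
        prompt += f"{user_msg}[/INST]"
        if model_answer is not None:
            # Completed turns are closed with </s>.
            prompt += f" {model_answer} </s>"
    return prompt


# Abbreviated system prompt, for illustration only (assumption).
short_system = "Sei un assistente disponibile, rispettoso e onesto."
print(build_prompt(short_system, [("Ciao! Come stai?", None)]))
```

Passing the earlier turns together with their answers, and `None` for the final answer, yields a prompt that ends in `[/INST]`, ready for generation.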
 user_msg = "Ciao! Come stai?"
+prompt = " [INST]<<SYS>>\n" \
          "Sei un assistente disponibile, rispettoso e onesto. " \
          "Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. " \
          "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
          "Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
          "Se non conosci la risposta a una domanda, non condividere informazioni false.\n" \
          "<</SYS>>\n\n" \
+         f"{user_msg}[/INST]"
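+import transformers  # provides transformers.pipeline
+pipe = transformers.pipeline(
+    model=model,
+    tokenizer=tokenizer,
+    return_full_text=False,  # return only the generated continuation, not the prompt
+    task='text-generation',
+    max_new_tokens=512,  # maximum number of tokens to generate
+    do_sample=True,  # sampling must be enabled for temperature to have an effect
+    temperature=0.8  # higher values give more varied answers, lower values more deterministic ones
+)
+# Method 1: generate through the pipeline
+sequences = pipe(prompt)
+for seq in sequences:
+    print(seq['generated_text'])
+# Method 2: call model.generate directly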
 input_ids = tokenizer(prompt, return_tensors="pt").input_ids
+outputs = model.generate(input_ids=input_ids, max_length=512)
 print(tokenizer.batch_decode(outputs.detach().cpu().numpy()[:, input_ids.shape[1]:], skip_special_tokens=True)[0])
 ```
+If you run into issues when loading the model, you can try loading it **quantized**:
 ```python
 model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
 ```
+*Note*:
+1) The model loading strategy above requires the [*bitsandbytes*](https://pypi.org/project/bitsandbytes/) and [*accelerate*](https://pypi.org/project/accelerate/) libraries.
+2) By default, the tokenizer adds the '\<BOS\>' token at the beginning of the prompt. If it does not, prepend the *\<s\>* string to the prompt.
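The second note can be checked programmatically. The sketch below is one possible way to do it, assuming a Hugging Face tokenizer object that exposes `bos_token`, `bos_token_id`, and returns `input_ids` when called on a string; the helper name `ensure_bos` is hypothetical.

```python
def ensure_bos(prompt, tokenizer):
    """Return a prompt whose tokenization starts with the BOS token.

    If the tokenizer already places the BOS id first, the prompt is returned
    unchanged; otherwise the BOS string (e.g. "<s>") is prepended.
    """
    token_ids = tokenizer(prompt)["input_ids"]
    if token_ids and token_ids[0] == tokenizer.bos_token_id:
        return prompt  # BOS is already present
    return tokenizer.bos_token + prompt


# Usage with the tokenizer loaded earlier in this card:
# prompt = ensure_bos(prompt, tokenizer)
```

Checking the first token id, rather than the tokenizer's configuration flags, keeps the helper independent of how a particular tokenizer version names its BOS option.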
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
+```
+*Notice:* Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved. [*License*](https://ai.meta.com/llama/license/)