mlabonne committed on
Commit 75e7737
1 Parent(s): 32b92e4

Update README.md

Files changed (1)
  1. README.md +47 -6
README.md CHANGED
@@ -4,18 +4,59 @@ datasets:
  - timdettmers/openassistant-guanaco
  pipeline_tag: text-generation
  ---

  📝 [Article](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32) |
  💻 [Colab](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing)

- This is a Llama 2-7b model QLoRA fine-tuned (4-bit precision) on the [`mlabonne/guanaco-llama2-1k`](https://huggingface.co/datasets/mlabonne/guanaco-llama2) dataset.

- It was trained on a Google Colab notebook with a T4 GPU and high RAM. It is mainly designed for educational purposes, not for inference.

- You can easily import it using the `AutoModelForCausalLM` class from `transformers`:

- ```
- from transformers import AutoModelForCausalLM
-
- model = AutoModelForCausalLM("mlabonne/llama-2-7b-miniguanaco")
- ```
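The removed snippet above calls `AutoModelForCausalLM(...)` directly, which is not how `transformers` loads a checkpoint; a minimal working form of that old example (same model id) goes through `from_pretrained`:

```python
# Minimal corrected form of the removed example: transformers loads checkpoints
# through from_pretrained rather than by calling the Auto class itself.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mlabonne/llama-2-7b-miniguanaco")
```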
  - timdettmers/openassistant-guanaco
  pipeline_tag: text-generation
  ---
+ # Llama-2-7b-guanaco

  📝 [Article](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32) |
  💻 [Colab](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing)

+ <center><img src="https://i.imgur.com/C2x7n2a.png" width="300"></center>

+ This is a Llama 2-7b model QLoRA fine-tuned (4-bit precision) on the [`mlabonne/guanaco-llama2`](https://huggingface.co/datasets/mlabonne/guanaco-llama2) dataset.

+ ## 🔧 Training

+ It was trained on a Google Colab notebook with a T4 GPU and high RAM.
+
+ ## 💻 Usage
+
+ ``` python
+ # pip install transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "mlabonne/llama-2-7b-miniguanaco"
+ prompt = "What is a large language model?"
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ sequences = pipeline(
+     f'<s>[INST] {prompt} [/INST]',
+     do_sample=True,
+     top_k=10,
+     num_return_sequences=1,
+     eos_token_id=tokenizer.eos_token_id,
+     max_length=200,
+ )
+ for seq in sequences:
+     print(f"Result: {seq['generated_text']}")
+ ```
 
+ Output:
+ >A large language model is a type of artificial intelligence (AI) model that is trained to generate human-like language. The models can be trained on text from a specific genre, such as news articles, or on a large corpus of text, such as the internet. They can then be used to generate text, such as articles, stories or even entire books. These models are often used in applications such as chatbots, language translation and content generation. They have been used to write books such as: "The Last Days of New Paris" by China Miéville.
+ >
+ >The large models are also used for many other applications such as:
+ >
+ >- Translation
+ >- Summarization
+ >- Sentiment Analysis
+ >- Text classification
+ >- Generative writing (creates articles, stories, and more.)
+ >- Conversational language generation.
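The updated card's 🔧 Training section names only the hardware, so here is a rough sketch of how a QLoRA run of this kind (4-bit quantized Llama 2 base plus LoRA adapters, trained on the guanaco dataset referenced above) is typically wired together with `transformers`, `peft`, `bitsandbytes`, and `trl`. This is an illustration under assumed library versions, not the author's notebook: the base checkpoint id, every hyperparameter, and the `SFTTrainer` arguments (whose names have changed across `trl` releases) are placeholders.

```python
# Illustrative QLoRA fine-tuning sketch (assumed stack: transformers, peft,
# bitsandbytes, trl, datasets). All values below are placeholders, not the
# settings actually used for llama-2-7b-miniguanaco.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint id
dataset = load_dataset("mlabonne/guanaco-llama2", split="train")

# Quantize the frozen base model to 4-bit NF4 so it fits on a single T4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# Train small LoRA adapters on top of the quantized base (the "QLoRA" part)
peft_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias="none", task_type="CAUSAL_LM"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",   # assumes the dataset exposes a "text" column
    max_seq_length=512,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="./results",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=1,
        learning_rate=2e-4,
        num_train_epochs=1,
        fp16=True,
        logging_steps=25,
    ),
)
trainer.train()
trainer.model.save_pretrained("llama-2-7b-miniguanaco")  # saves the LoRA adapter
```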