---
license: apache-2.0
datasets:
- bertin-project/alpaca-spanish
language:
- es
library_name: transformers
---
# Zenos GPT-J 6B Alpaca-Evol 4-bit

## Model Overview

- **Name:** zenos-gpt-j-6B-alpaca-evol-4bit
- **Datasets Used:** [Alpaca Spanish](https://huggingface.co/datasets/bertin-project/alpaca-spanish), [Evol Instruct](https://huggingface.co/datasets/FreedomIntelligence/evol-instruct-spanish)
- **Architecture:** GPT-J
- **Model Size:** 6 billion parameters
- **Precision:** 4-bit
- **Fine-tuning:** Fine-tuned with Low-Rank Adaptation (LoRA).
- **Content Moderation:** This model is not moderated.
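
Because the checkpoint is stored in 4-bit precision, one common way to load such models with recent versions of Transformers is through a `BitsAndBytesConfig`. This is a hedged sketch, not this card's official loading path: the `nf4` quantization type and compute dtype are assumptions, and the exact procedure for this particular checkpoint may differ.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumption: bitsandbytes-style 4-bit loading; verify against the
# repository's own instructions before relying on this.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # assumed quantization scheme
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "zenos-gpt-j-6B-alpaca-evol-4bit",    # model id as used elsewhere in this card
    quantization_config=bnb_config,
    device_map="auto",
)
```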

## Description

Zenos GPT-J 6B Alpaca-Evol 4-bit is a Spanish instruction-following model based on the GPT-J architecture with 6 billion parameters. It was fine-tuned on the Alpaca Spanish and Evol Instruct datasets, making it particularly suitable for natural language understanding and generation tasks in Spanish.

## Usage

You can use this model for natural language processing tasks such as text generation, summarization, and translation. The example below shows how to run it in Python with the Transformers library:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("zenos-gpt-j-6B-alpaca-evol-4bit")
model = AutoModelForCausalLM.from_pretrained("zenos-gpt-j-6B-alpaca-evol-4bit")

# Alpaca-style Spanish instruction prompt
prompt = 'A continuación hay una instrucción que describe una tarea. Escribe una respuesta que complete adecuadamente lo que se pide.\n\n### Instrucción:\nEscribe un poema breve usando cuatro estrofas\n\n### Respuesta:\n'

inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to(model.device)
attention_mask = inputs["attention_mask"].to(model.device)

generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=1,
    repetition_penalty=1.5,
    do_sample=True,
)

# Generate text
with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        pad_token_id=tokenizer.eos_token_id,
        attention_mask=attention_mask,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=False,
        max_new_tokens=512,
        early_stopping=True,
    )

# Decode and keep only the text after the "### Respuesta:" marker
s = generation_output.sequences[0]
output = tokenizer.decode(s)
start_txt = output.find('### Respuesta:\n') + len('### Respuesta:\n')
end_txt = output.find("<|endoftext|>", start_txt)
answer = output[start_txt:end_txt]

print(answer)
```
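
The prompt construction and answer extraction above are plain string manipulation, so they can be factored into small reusable helpers. These are illustrative utilities, not part of the released code; the names `build_prompt` and `extract_answer` are hypothetical.

```python
# Hypothetical helpers mirroring the prompt template used above.
PROMPT_TEMPLATE = (
    'A continuación hay una instrucción que describe una tarea. '
    'Escribe una respuesta que complete adecuadamente lo que se pide.'
    '\n\n### Instrucción:\n{instruction}\n\n### Respuesta:\n'
)


def build_prompt(instruction: str) -> str:
    """Format an instruction with the Alpaca-style Spanish template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def extract_answer(decoded: str, eos: str = "<|endoftext|>") -> str:
    """Return the text between the '### Respuesta:' marker and the EOS token.

    Unlike a bare slice with find()'s -1 sentinel, this keeps the full tail
    when no EOS marker is present in the decoded output.
    """
    marker = "### Respuesta:\n"
    start = decoded.find(marker) + len(marker)
    end = decoded.find(eos, start)
    return decoded[start:] if end == -1 else decoded[start:end]
```

With these helpers, the generation snippet reduces to `prompt = build_prompt("Escribe un poema breve usando cuatro estrofas")` before calling the tokenizer, and `answer = extract_answer(output)` after decoding.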