|
--- |
|
datasets: |
|
- tatsu-lab/alpaca |
|
language: |
|
- en |
|
pipeline_tag: text2text-generation |
|
library_name: transformers |
|
license: other |
|
--- |
|
|
|
|
|
# Model Details |
|
|
|
- **Model name:** Flan-T5-Large-Alpaca |
|
- **Model type:** Text2Text Generation
|
- **Parent Model:** [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) |
|
- **Training dataset:** [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) |
|
- **Language:** English |
|
- **Framework:** PyTorch |
|
- **Model version:** 1.0 |
|
|
|
|
|
We take the instruction-tuned Flan-T5 model (trained on academic datasets) and perform style transfer by fine-tuning it on the Alpaca dataset.
|
|
|
# License |
|
- Parent model ([google/flan-t5-large](https://huggingface.co/google/flan-t5-large)): Apache 2.0 |
|
- Dataset ([Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca)) : cc-by-4.0 |
|
- Text-Davinci-3 (Used to generate Alpaca): [OpenAI License](https://openai.com/policies/terms-of-use) |
|
|
|
|
|
|
|
# How to Use |
|
|
|
```python
from transformers import pipeline

model = pipeline(model="vmware/flan-t5-large-alpaca", device_map="auto")

prompt = "Give me the recipe for making a caramel flan"

output = model(prompt, max_length=256, do_sample=True)
print(output)

'''
[{'generated_text': 'Recipe for making caramel flan: 2 cups all-purpose flour 3 cups butter 6 tablespoons sugar 2 tablespoons melted dark chocolate 2 cups milk Instructions: 1. Preheat oven to 350°F (180°C). 2. Grease 9 inch round cake pan. 3. In a large bowl, whisk together flour, baking powder, cocoa powder, salt and salt. 4. Beat until fluffy. 5. Add the melted chocolate, vanilla and sugar and beat until blended. 6. Pour batter into the prepared pan and bake for 45 minutes. 7. Remove the pan from the oven and let cool before serving.'}]
'''
```
|
|
|
|
|
Using the Alpaca prompt template may produce better outputs for certain prompts, as the model was trained with this template.
|
|
|
```python
from transformers import pipeline

model = pipeline(model="vmware/flan-t5-large-alpaca", device_map="auto")

prompt_template = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

prompt = "YOUR PROMPT HERE"

output = model(prompt_template.format(instruction=prompt), max_length=256, do_sample=True)
print(output)
```
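
As a self-contained illustration of the template usage above (no model download required), here is a small helper that wraps a raw instruction in the Alpaca format. `build_prompt` is a hypothetical name used for illustration, not part of the model's API:

```python
# Alpaca prompt template used at training time (same string as in the card above).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    "\n\n### Instruction:\n{instruction}\n\n### Response:"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the Alpaca template (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Give me the recipe for making a caramel flan")
print(prompt)
```

The formatted string can then be passed directly to the pipeline in place of the raw prompt.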
|
|
|
# Training Details |
|
|
|
The model was trained on 3 x V100 GPUs.
|
|
|
* Hyperparameters: |
|
* learning_rate = 5e-5 |
|
* batch_size = 128 |
|
* epochs = 3 |
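
The hyperparameters above could be expressed as a `Seq2SeqTrainingArguments` configuration. This is a hypothetical sketch, not the actual training script: only the learning rate, overall batch size of 128, and epoch count come from this card; the per-device batch size and accumulation split are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-alpaca",
    learning_rate=5e-5,               # from the card
    per_device_train_batch_size=8,    # assumed; the card only reports batch_size = 128
    gradient_accumulation_steps=16,   # assumed split across the 3 GPUs
    num_train_epochs=3,               # from the card
)
```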
|
|
|
|
|
|
|
|
|
|
|
|
# Limitations and Bias |
|
|
|
The model is based on a large and diverse dataset, but it may still have limitations and biases in certain areas. Some limitations include: |
|
|
|
- Language: The model is designed to work with English text only and may not perform as well in other languages. |
|
|
|
|
|
In addition, the model may carry biases from the data it was trained on. The dataset includes instructions from a variety of sources, but it may not be representative of all populations or perspectives. As a result, the model may perform better or worse on certain types of prompts or texts.