|
--- |
|
license: mit |
|
datasets: |
|
- Reza8848/MUFFIN_68k |
|
--- |
|
|
|
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/J_4FHXmtM6TuRnN3aL06y.png" width="38" height="38"> |
|
|
|
This repository contains Llama2 LoRA weights fine-tuned on **MUFFIN** (**Mu**lti-**F**aceted **In**structions).
|
|
|
We fine-tune [Llama2-13B](https://huggingface.co/meta-llama/Llama-2-13b-hf) on the [MUFFIN dataset](https://arxiv.org/abs/2312.02436) with LoRA (low-rank adaptation).
|
|
|
We release the LoRA weights for both the Llama2 7B and 13B models:
|
|Model|LoRA Target Modules|
|-|-|
|[MUFFIN-Llama2-7B](https://huggingface.co/Reza8848/MUFFIN-Llama2-lora-7B)|`Q, K, V, O`|
|[MUFFIN-Llama2-13B](https://huggingface.co/Reza8848/MUFFIN-Llama2-lora-13B)|`Q, K, V, O`|
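
Here, `Q, K, V, O` denote the attention projection matrices of Llama2 (`q_proj`, `k_proj`, `v_proj`, `o_proj` in the Hugging Face implementation). A PEFT configuration targeting them would look roughly like this (a minimal sketch; the rank, alpha, and dropout values are illustrative assumptions, not our exact training hyper-parameters):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                 # LoRA rank: an assumed value
    lora_alpha=16,        # scaling factor: an assumed value
    lora_dropout=0.05,    # an assumed value
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Q, K, V, O
    task_type="CAUSAL_LM",
)
```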
|
|
|
You can also find the T5-based models [here](https://huggingface.co/Reza8848/MUFFIN-T5-3B). |
|
|
|
|
|
## Model Usage |
|
|
|
### 1. Inference code |
|
|
|
We use [Alpaca-lora](https://github.com/tloen/alpaca-lora) as our fine-tuning codebase.
|
|
|
Therefore, when using the released weights for inference, we recommend the [generation code](https://github.com/tloen/alpaca-lora/blob/main/generate.py) of Alpaca-lora to reproduce our performance.
|
|
|
|
|
Please follow the Alpaca-lora documentation to set up the **correct Python environment first**.
|
|
|
|
|
> Our released LoRA weights are in **`.safetensors`** format rather than the common **`.bin`** torch model files.
> Incompatible `transformers` and `torch` versions may result in [PEFT compatibility errors](https://github.com/huggingface/transformers/issues/27397) when using the released LoRA weights.
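
If you prefer a minimal standalone script over Alpaca-lora's `generate.py`, the released adapter can be loaded with `transformers` and `peft` roughly as follows (a sketch assuming compatible library versions, not our exact evaluation script):

```python
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base_model_id = "meta-llama/Llama-2-13b-hf"
adapter_id = "Reza8848/MUFFIN-Llama2-lora-13B"

tokenizer = LlamaTokenizer.from_pretrained(base_model_id)
model = LlamaForCausalLM.from_pretrained(
    base_model_id, torch_dtype=torch.float16, device_map="auto"
)
# PeftModel resolves the released .safetensors adapter weights automatically
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```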
|
|
|
|
|
### 2. Prompt template |
|
|
|
Please use the following prompt template (save the following dict as a JSON file, e.g. `muffin.json`, under the [`templates` folder](https://github.com/tloen/alpaca-lora/tree/main/templates)):
|
|
|
```json |
|
{ |
|
"description": "Template used by muffin.", |
|
"prompt_input": "### Input:\n{input}\n\n### Instruction:\n{instruction}\n\n### Response:\n", |
|
"prompt_no_input": "### Input:\nNone\n\n### Instruction:\n{instruction}\n\n### Response:\n", |
|
"response_split": "### Response:" |
|
} |
|
``` |
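
For illustration, here is a minimal sketch of how this template is applied, mirroring the `Prompter` logic in Alpaca-lora (`build_prompt` and `extract_response` are hypothetical helpers, assuming the file was saved as `templates/muffin.json`):

```python
import json

with open("templates/muffin.json") as f:  # the template saved above
    template = json.load(f)

def build_prompt(instruction, input_text=None):
    """Fill the MUFFIN template (mirrors Prompter.generate_prompt in Alpaca-lora)."""
    if input_text:
        return template["prompt_input"].format(instruction=instruction, input=input_text)
    return template["prompt_no_input"].format(instruction=instruction)

def extract_response(output_text):
    """Keep only the text generated after '### Response:'."""
    return output_text.split(template["response_split"])[-1].strip()

prompt = build_prompt(
    instruction="Summarize the following paragraph in one sentence.",
    input_text="LoRA freezes the base model and trains low-rank update matrices ...",
)
```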
|
|
|
### 3. Generation hyper-parameters |
|
|
|
We use the default generation hyper-parameters as defined in [this line](https://github.com/tloen/alpaca-lora/blob/main/generate.py#L90).
|
|
|
Besides, be aware of the following hyper-parameters (see the sketch after this list):

- `max_input_len == 1024`. This is the maximum input length used during training. At inference, any length is fine, since our evaluation batch size is 1.
- `num_beams == 1`. In our experiments, we set the beam size to 1, but we recommend trying a larger beam size for potentially better responses.
- When doing batched inference, make sure `tokenizer.padding_side = "left"`, as we left-padded all batched instances during tuning (though this should not have a large impact on inference results).
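
Putting these together (a sketch reusing `model`, `tokenizer`, `torch`, and the `prompt`/`extract_response` helpers from the earlier snippets; `max_new_tokens` is an assumed value, not our exact setting):

```python
from transformers import GenerationConfig

tokenizer.padding_side = "left"  # match the left padding used during tuning

generation_config = GenerationConfig(
    num_beams=1,         # beam size used in our experiments
    max_new_tokens=512,  # assumption: choose a budget that suits your task
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, generation_config=generation_config)

output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(extract_response(output_text))
```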
|
|
|
## Zero-Shot Evaluation Performance
|
|
|
We use the [metric calculation scripts](https://github.com/yizhongw/Tk-Instruct/blob/main/src/compute_metrics.py) of [Tk-Instruct](https://github.com/yizhongw/Tk-Instruct/tree/main) to compute `ROUGE-L` and `Exact-Match`.
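
For a quick sanity check without cloning Tk-Instruct, the same metrics can be approximated with the Hugging Face `evaluate` library (an approximation only: the Tk-Instruct script applies its own answer normalization):

```python
import evaluate

rouge = evaluate.load("rouge")
exact_match = evaluate.load("exact_match")

predictions = ["The cat sat on the mat."]
references = ["A cat sat on the mat."]

print(rouge.compute(predictions=predictions, references=references)["rougeL"])
print(exact_match.compute(predictions=predictions, references=references)["exact_match"])
```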
|
|
|
<div style="text-align:center"><img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/IjeMYWLMRO_qGOOiXxemP.png" alt="performances.png" width="600"/></div> |
|
|
|
|
|
|
|
## 🥳 Citation |
|
|
|
Please kindly cite our paper if you use any resources in this repository: |
|
|
|
```bibtex |
|
@inproceedings{Lou2023MUFFIN, |
|
title={{MUFFIN}: Curating Multi-Faceted Instructions for Improving Instruction Following}, |
|
  author={Renze Lou and Kai Zhang and Jian Xie and Yuxuan Sun and Janice Ahn and Hanzi Xu and Yu Su and Wenpeng Yin},
|
booktitle={The Twelfth International Conference on Learning Representations}, |
|
year={2024}, |
|
url={https://openreview.net/forum?id=1vrS1zwekw} |
|
} |
|
``` |