These are the Llama2 LoRA weights fine-tuned on MUFFIN (Multi-Faceted Instructions). We fine-tuned Llama2 on the MUFFIN dataset with LoRA (low-rank adaptation) and release the LoRA weights of both the 7B and 13B models:
| Model | LoRA Target Modules |
|---|---|
| MUFFIN-Llama2-7B | Q, K, V, O |
| MUFFIN-Llama2-13B | Q, K, V, O |
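For reference, the Q, K, V, O modules in the table correspond to the query, key, value, and output projections of the attention layers. Below is a minimal sketch of a matching PEFT configuration; the `r`, `lora_alpha`, and `lora_dropout` values are placeholders, not the settings used for the released weights.

```python
from peft import LoraConfig

# Sketch of a LoRA config targeting the Q, K, V, O attention projections
# (named q_proj/k_proj/v_proj/o_proj in the Hugging Face Llama2 implementation).
# r, lora_alpha, and lora_dropout are placeholder values, not the
# hyper-parameters used for the released weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```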
## Model Usage
### 1. Inference code
We use Alpaca-lora as our fine-tuning codebase. When using the released model weights for inference, we recommend using Alpaca-lora's generation code to reproduce our performance. Please follow the Alpaca-lora documentation to set up the correct Python environment first.
Our released LoRA weights are in `.safetensors` format rather than the common `.bin` torch model files. Wrong `transformers` and `torch` versions may result in PEFT compatibility errors when loading the released LoRA weights.
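For completeness, here is a minimal loading sketch with `transformers` and `peft`; the adapter path `path/to/muffin-lora` is a placeholder for wherever you store the released `.safetensors` weights.

```python
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the frozen base model (fp16 so it fits on a single GPU).
base_model = LlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Attach the released LoRA adapter; "path/to/muffin-lora" is a placeholder
# for the local directory (or Hub id) holding the .safetensors weights.
model = PeftModel.from_pretrained(base_model, "path/to/muffin-lora")
model.eval()
```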
### 2. Prompt template
Please use the following prompt template (save it as a JSON file under the 'template' folder):
```json
{
    "description": "Template used by muffin.",
    "prompt_input": "### Input:\n{input}\n\n### Instruction:\n{instruction}\n\n### Response:\n",
    "prompt_no_input": "### Input:\nNone\n\n### Instruction:\n{instruction}\n\n### Response:\n",
    "response_split": "### Response:"
}
```
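As an illustration, the template can be loaded and filled like this; the file name `template/muffin.json` and the instruction/input strings are made up for the example.

```python
import json

# Load the template saved above; the file name is a placeholder.
with open("template/muffin.json") as f:
    template = json.load(f)

# Instances with an input use "prompt_input"; instances without one
# would use "prompt_no_input" instead.
prompt = template["prompt_input"].format(
    input="The quick brown fox jumps over the lazy dog.",
    instruction="How many words does the input sentence contain?",
)
print(prompt)
```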
### 3. Generation hyper-parameters
We use the default generation hyper-parameters as identified in this line.
Besides, be aware of the following hyper-parameters:

- `max_input_len == 1024`. This is the maximum input length used during training, but it is fine to use any length at inference, since our evaluation batch size is 1.
- `num_beams == 1`. In our experiments, we set the beam size to 1, but we recommend trying a larger beam size to get better responses from the models.
- When doing batched inference, please make sure `tokenizer.padding_side = "left"`, as we left-padded all batched instances during tuning (though this should not have a big impact on inference results).
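Putting the pieces together, here is a hedged generation sketch that honors the settings above; it reuses `model`, `tokenizer`, `prompt`, and `template` from the earlier snippets, and the `max_new_tokens` value is an arbitrary choice, not from the paper.

```python
from transformers import GenerationConfig

# Left padding is required for batched inference, as noted above.
tokenizer.padding_side = "left"
tokenizer.pad_token = tokenizer.eos_token  # Llama2 has no pad token by default

inputs = tokenizer([prompt], return_tensors="pt", padding=True).to(model.device)
outputs = model.generate(
    **inputs,
    generation_config=GenerationConfig(num_beams=1, max_new_tokens=256),
)

# Keep only the text after "### Response:", per the template's response_split.
full_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(full_text.split(template["response_split"])[-1].strip())
```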
## 🥳 Citation
Please kindly cite our paper if you use any resources in this repository:
```bibtex
@inproceedings{Lou2023MUFFIN,
    title={{MUFFIN}: Curating Multi-Faceted Instructions for Improving Instruction Following},
    author={Renze Lou and Kai Zhang and Jian Xie and Yuxuan Sun and Janice Ahn and Hanzi Xu and Yu Su and Wenpeng Yin},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=1vrS1zwekw}
}
```