Update README.md
README.md CHANGED
```diff
@@ -52,10 +52,9 @@ Please use the following prompt template (save the following dict as a JSON file
 We use the default generation hyper-parameters as identified in [this line](https://github.com/tloen/alpaca-lora/blob/main/generate.py#L90).
 
 Besides, be aware of the following hyper-parameters:
-- `eval_batch_size == 1`. **Using batched inference (eval_batch_size > 1) will result in weird performance**.
 - `max_input_len == 1024`. This is the `max_input_len` used during training, but it is fine to use any length at inference since our evaluation batch size is 1.
 - `num_beams == 1`. In our experiments we set the beam size to 1, but we recommend trying a larger beam size to get better responses from the models.
-
+- When doing batched inference, please make sure `tokenizer.padding_side = "left"`, as we left-padded all batched instances during tuning (though this should not have a big impact on inference results).
 
 ## Zero-Shot Evaluation Performances
 
```
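To make these settings concrete, here is a minimal sketch of a matching `transformers` `GenerationConfig`. The sampling values are assumptions carried over from the linked alpaca-lora defaults rather than anything stated in this README, so verify them against the linked line; only `num_beams = 1` is given above.

```python
from transformers import GenerationConfig

# A sketch of the generation settings. The sampling values below are
# assumptions based on the linked alpaca-lora defaults, not values stated
# in this README; check the linked line for the authoritative numbers.
generation_config = GenerationConfig(
    temperature=0.1,  # assumption
    top_p=0.75,       # assumption
    top_k=40,         # assumption
    num_beams=1,      # stated above: beam size 1 in these experiments
)
```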
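And here is a minimal sketch of left-padded batched inference with the Hugging Face `transformers` API, assuming a generic causal LM checkpoint; the model name, prompts, and `max_new_tokens` below are placeholders, not the repository's exact setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the tuned model you are evaluating.
model_name = "your-org/your-tuned-model"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Left padding, matching how batched instances were padded during tuning.
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenizers often lack a pad token

prompts = ["<instruction 1>", "<instruction 2>"]  # placeholder prompts
inputs = tokenizer(
    prompts, return_tensors="pt", padding=True, truncation=True, max_length=1024
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, num_beams=1, max_new_tokens=256)

# With left padding, every prompt ends at the same position, so the generated
# tokens can be sliced off uniformly before decoding.
responses = tokenizer.batch_decode(
    outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(responses)
```

Left padding matters here because, with right padding, pad tokens would sit between each prompt and its generated continuation in a decoder-only model, which is what degrades batched inference.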