Update README.md
README.md CHANGED
```diff
@@ -52,10 +52,9 @@ Please use the following prompt template (save the following dict as a JSON file
 We use the default generation hyper-parameters as identified in [this line](https://github.com/tloen/alpaca-lora/blob/main/generate.py#L90).
 
 Besides, be aware of the following hyper-parameters:
-- `eval_batch_size == 1`. **Using batched inference (eval_batch_size > 1) will result in weird performance**.
 - `max_input_len == 1024`. This is the `max_input_len` used during training, but it is fine to use any length at inference since our evaluation batch size is 1.
 - `num_beams == 1`. In our experiments we set the beam size to 1, but we recommend trying a larger beam size to get better responses from the models.
-
+- When doing batched inference, please make sure `tokenizer.padding_side = "left"`, as we left-padded all batched instances during tuning (though this should not have a big impact on inference results).
 
 ## Zero-Shot Evaluation Performances
 
```
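To make these settings concrete, here is a minimal sketch of a matching `transformers` `GenerationConfig`. The sampling values are assumptions carried over from the linked alpaca-lora defaults rather than anything stated in this README, so verify them against the linked line; only `num_beams = 1` is given above.

```python
from transformers import GenerationConfig

# A sketch of the generation settings. The sampling values below are
# assumptions based on the linked alpaca-lora defaults, not values stated
# in this README; check the linked line for the authoritative numbers.
generation_config = GenerationConfig(
    temperature=0.1,  # assumption
    top_p=0.75,       # assumption
    top_k=40,         # assumption
    num_beams=1,      # stated above: beam size 1 in these experiments
)
```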
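And here is a minimal sketch of left-padded batched inference with the Hugging Face `transformers` API, assuming a generic causal LM checkpoint; the model name, prompts, and `max_new_tokens` below are placeholders, not the repository's exact setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the tuned model you are evaluating.
model_name = "your-org/your-tuned-model"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Left padding, matching how batched instances were padded during tuning.
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenizers often lack a pad token

prompts = ["<instruction 1>", "<instruction 2>"]  # placeholder prompts
inputs = tokenizer(
    prompts, return_tensors="pt", padding=True, truncation=True, max_length=1024
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, num_beams=1, max_new_tokens=256)

# With left padding, every prompt ends at the same position, so the generated
# tokens can be sliced off uniformly before decoding.
responses = tokenizer.batch_decode(
    outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(responses)
```

Left padding matters here because, with right padding, pad tokens would sit between each prompt and its generated continuation in a decoder-only model, which is what degrades batched inference.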