Why is "use_cache" disabled by default in the generation_config.json?
#10, opened by jdpressman
```
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.35.2",
  "use_cache": false
}
```
This confused me as a new user because my inference kept getting slower with each generated token. Is there a specific reason why it's disabled?
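For anyone else hitting this, here is a rough sketch of why the slowdown shows up. It is a simplified cost model, not the actual transformers implementation: the function name `positions_processed` and the prompt/token counts are made up for illustration. It only counts how many token positions have to go through the model at each decoding step.

```python
def positions_processed(prompt_len: int, new_tokens: int, use_cache: bool) -> list[int]:
    """Count token positions run through the model at each decoding step.

    Simplified, hypothetical model: without a key/value cache the whole
    sequence is re-processed every step; with a cache only the newest
    token is processed after the first step.
    """
    per_step = []
    seq_len = prompt_len
    for step in range(new_tokens):
        if use_cache and step > 0:
            # Earlier keys/values are reused, so only one new position is computed.
            per_step.append(1)
        else:
            # No cache (or the very first step): the full sequence is processed.
            per_step.append(seq_len)
        seq_len += 1  # the generated token is appended to the sequence
    return per_step


no_cache = positions_processed(prompt_len=10, new_tokens=5, use_cache=False)
with_cache = positions_processed(prompt_len=10, new_tokens=5, use_cache=True)

print(no_cache)    # [10, 11, 12, 13, 14]
print(with_cache)  # [10, 1, 1, 1, 1]
```

Without the cache, the per-step work grows with the sequence length, which matches the "slower with each token" behavior. As far as I know, passing `use_cache=True` to `model.generate(...)` overrides the value from `generation_config.json` for that call, so that may be a workaround until the default is explained or changed.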