Why is "use_cache" disabled by default in the generation_config.json?

#10

by jdpressman - opened Feb 4


```json
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.35.2",
  "use_cache": false
}
```

This confused me as a new user because generation kept getting slower with each new token. Is there a specific reason why the cache is disabled?
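For anyone else hitting this, here is a rough sketch of why the slowdown appears when `use_cache` is false. It is a simplified cost model in plain Python, not the actual `transformers` implementation: the function name `positions_processed` and its parameters are hypothetical, and it only counts how many token positions go through the model at each decoding step.

```python
def positions_processed(prompt_len, new_tokens, use_cache):
    """Count token positions fed through the model at each decoding step.

    Simplified model (an assumption for illustration): without a key/value
    cache, every step re-encodes the whole sequence so far; with a cache,
    only the newest token is encoded after the first step.
    """
    per_step = []
    seq_len = prompt_len
    for step in range(new_tokens):
        if use_cache and step > 0:
            # Earlier keys/values are reused, so only one new position is computed.
            per_step.append(1)
        else:
            # No cache, or the first step that must encode the prompt:
            # the full sequence is processed again.
            per_step.append(seq_len)
        seq_len += 1  # the generated token extends the sequence
    return per_step


print(positions_processed(4, 4, use_cache=False))  # [4, 5, 6, 7]
print(positions_processed(4, 4, use_cache=True))   # [4, 1, 1, 1]
```

Without the cache the per-step work grows with the sequence length, which matches the slowdown described above. As far as I can tell, passing `use_cache=True` to `model.generate(...)` overrides the value from `generation_config.json` for that call, so that may be a workaround while the default stays as it is.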
