[Don't merge] inferentia2 workaround
#34 by philschmid (HF staff) - opened
This is a workaround for deploying Llama 3 on Inferentia with TGI. The new generation_config
now has a list as eos_token_id, which causes the deployment to fail. This revision removes one of the entries so that eos_token_id is a single value.
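For readers who want to apply the same kind of change to their own copy of the config, here is a minimal sketch of the idea in Python. It is not the exact diff of this revision: the helper name `collapse_eos_token_id`, the `keep_index` parameter, and the token IDs in the example are all hypothetical, and which entry should be kept is an assumption you need to check against your model.

```python
import json


def collapse_eos_token_id(config: dict, keep_index: int = 0) -> dict:
    """Return a copy of `config` whose eos_token_id is a single int.

    Hypothetical helper: if eos_token_id is a list, keep only the entry
    at `keep_index`; otherwise leave the value untouched.
    """
    patched = dict(config)  # shallow copy so the input is not modified
    eos = patched.get("eos_token_id")
    if isinstance(eos, list):
        if not eos:
            raise ValueError("eos_token_id is an empty list")
        patched["eos_token_id"] = eos[keep_index]
    return patched


# Illustrative placeholder values, not the real Llama 3 token IDs.
example = {"bos_token_id": 1, "eos_token_id": [2, 3]}
patched = collapse_eos_token_id(example)
print(json.dumps(patched, indent=2))
```

In practice you would load the dict from generation_config.json with `json.load`, pass it through the helper, and write the result back with `json.dump`. Note that dropping an entry means generation no longer stops on that token, which is presumably why this is marked as a workaround rather than a fix.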
philschmid changed pull request title from "inferentia2 workaround" to "[Don't merge] inferentia2 workaround"