What is the max input token length of the Falcon-40B and -7B models?
#38 · by sermolin · opened
I couldn't find it in the documentation. The reference notebook hardcodes it to 1024 and mentions the need to set int8 if the input length is >1024 — but what's the max?
Use case: document summarization and text generation. I probably wouldn't want to use the -Instruct model for that, right?
Anyone have an answer for this one?
2048
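To make the 2048 figure concrete: the usable input budget is the context window minus however many tokens you plan to generate. Here's a minimal sketch of that arithmetic — the helper names (`max_input_tokens`, `truncate_input`) are hypothetical, not part of any Falcon or TGI API:

```python
# Falcon-7B/40B context window, per the answer above.
FALCON_MAX_SEQ_LEN = 2048

def max_input_tokens(max_new_tokens: int, max_seq_len: int = FALCON_MAX_SEQ_LEN) -> int:
    """Input budget = context window minus the generation budget."""
    if max_new_tokens >= max_seq_len:
        raise ValueError("max_new_tokens must be smaller than the context window")
    return max_seq_len - max_new_tokens

def truncate_input(token_ids: list[int], max_new_tokens: int) -> list[int]:
    """Keep only the most recent tokens that still fit in the input budget."""
    budget = max_input_tokens(max_new_tokens)
    return token_ids[-budget:]

print(max_input_tokens(512))  # → 1536
```

So for summarization, if you reserve 512 tokens for the summary, the document itself can only occupy 1536 tokens and has to be chunked beyond that.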
I tried to increase the number of tokens in openapi.json (cloned the repo and found the value simply by searching for 1024), but that didn't help. I created a feature request: https://github.com/huggingface/text-generation-inference/issues/593. Please add to it if you need any adaptations.
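For what it's worth, openapi.json only describes the API; the actual limits are set when the server starts. If I understand the TGI launcher correctly (flag names as of mid-2023 — verify with `text-generation-launcher --help`), something like this should raise the input limit without patching the repo:

```shell
# Sketch: set the serving limits at launch time instead of editing openapi.json.
# --max-input-length must stay below --max-total-tokens (input + generated).
text-generation-launcher \
  --model-id tiiuae/falcon-7b \
  --max-input-length 1536 \
  --max-total-tokens 2048
```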