Limited (truncated) response with inference API
#23
by RobertTaylor · opened
I am getting a limited output when using the inference API. This is also the case within the on-page example. Is HF's rate-limiting per-token?
Hi there, did you set max_new_tokens?
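For anyone else hitting this: a minimal sketch of a text-generation request with max_new_tokens set explicitly. The model name, prompt, and token are placeholders, not from this thread; without the parameter the API falls back to a small default, which is why the output looks truncated.

```python
import json

# Placeholder model endpoint (assumption for illustration).
API_URL = "https://api-inference.huggingface.co/models/gpt2"

payload = {
    "inputs": "Once upon a time",
    "parameters": {
        # Raise this to get longer completions; the default is small,
        # which makes responses appear truncated.
        "max_new_tokens": 250,
    },
}

body = json.dumps(payload)

# To actually send it (requires your own HF token):
# import requests
# headers = {"Authorization": "Bearer <YOUR_HF_TOKEN>"}  # placeholder token
# response = requests.post(API_URL, headers=headers, data=body)
# print(response.json())
```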
Gosh, thanks for that. Sorry, I'm an idiot.
what are the other parameters?
Hi, maybe this can help: https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task
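As a quick sketch of what that page covers for the text-generation task, here is an example request body using several of the documented parameters (values are illustrative, not recommendations):

```python
payload = {
    "inputs": "Once upon a time",
    "parameters": {
        "max_new_tokens": 100,       # cap on generated tokens
        "temperature": 0.7,          # sampling temperature
        "top_k": 50,                 # keep only the k most likely tokens
        "top_p": 0.95,               # nucleus sampling threshold
        "repetition_penalty": 1.2,   # discourage repeating tokens
        "do_sample": True,           # sample instead of greedy decoding
        "return_full_text": False,   # omit the prompt from the output
    },
    # Request-level options, also described in the docs:
    "options": {
        "use_cache": False,          # force a fresh generation
        "wait_for_model": True,      # block while the model loads
    },
}
```

The full list (including num_return_sequences and max_time) is on the linked page.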