Limited (truncated) response with inference API
#23
by RobertTaylor · opened
I am getting a limited output when using the inference API. This is also the case within the on-page example. Is HF's rate-limiting per-token?
Hi there, did you set max_new_tokens?
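For anyone else hitting this: a minimal sketch of a text-generation request with max_new_tokens set explicitly. The model name, prompt, and token are placeholders, not from this thread; without the parameter the API falls back to a small default, which is why the output looks truncated.

```python
import json

# Placeholder model endpoint (assumption for illustration).
API_URL = "https://api-inference.huggingface.co/models/gpt2"

payload = {
    "inputs": "Once upon a time",
    "parameters": {
        # Raise this to get longer completions; the default is small,
        # which makes responses appear truncated.
        "max_new_tokens": 250,
    },
}

body = json.dumps(payload)

# To actually send it (requires your own HF token):
# import requests
# headers = {"Authorization": "Bearer <YOUR_HF_TOKEN>"}  # placeholder token
# response = requests.post(API_URL, headers=headers, data=body)
# print(response.json())
```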
Gosh, thanks for that. Sorry, I'm an idiot.
what are the other parameters?
Hi, maybe this can help: https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task
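As a quick sketch of what that page covers for the text-generation task, here is an example request body using several of the documented parameters (values are illustrative, not recommendations):

```python
payload = {
    "inputs": "Once upon a time",
    "parameters": {
        "max_new_tokens": 100,       # cap on generated tokens
        "temperature": 0.7,          # sampling temperature
        "top_k": 50,                 # keep only the k most likely tokens
        "top_p": 0.95,               # nucleus sampling threshold
        "repetition_penalty": 1.2,   # discourage repeating tokens
        "do_sample": True,           # sample instead of greedy decoding
        "return_full_text": False,   # omit the prompt from the output
    },
    # Request-level options, also described in the docs:
    "options": {
        "use_cache": False,          # force a fresh generation
        "wait_for_model": True,      # block while the model loads
    },
}
```

The full list (including num_return_sequences and max_time) is on the linked page.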