Incomplete Output even with max_new_tokens
#37
by
Pradeep1995
- opened
The output of my finetuned OpenChat model ends abruptly, and I would ideally like it to finish the paragraph, sentence, or code block it was in the middle of.
I have set max_new_tokens = 300 and also instruct the model in the prompt to limit the response to 300 words.
The response is always long and ends abruptly. Is there any way to get a complete output within the desired number of output tokens?
Generation stops as soon as the specified max_new_tokens limit is reached, without checking whether the sentence is complete.
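One common workaround is to treat max_new_tokens as a hard cap and add a softer stopping rule of your own: aim for a target length, but only actually stop once the text reaches a sentence boundary. Below is a minimal sketch of that decision logic in plain Python (the limits 300/360 and the set of terminators are assumptions to tune for your model, not anything from OpenChat itself):

```python
def should_stop(text: str, new_tokens: int,
                soft_limit: int = 300, hard_limit: int = 360) -> bool:
    """Decide whether generation should stop.

    - Below soft_limit: keep generating.
    - Between soft_limit and hard_limit: stop only if the decoded text
      currently ends at a sentence boundary.
    - At hard_limit: stop unconditionally (safety cap).
    soft_limit/hard_limit and the terminator set are illustrative values.
    """
    if new_tokens >= hard_limit:
        return True
    if new_tokens < soft_limit:
        return False
    # Treat common sentence-ending punctuation (and a closing code
    # fence) as acceptable places to stop.
    return text.rstrip().endswith((".", "!", "?", "```"))
```

With Hugging Face transformers, logic like this can be wrapped in a `StoppingCriteria` subclass (decoding the generated ids each step) and passed to `model.generate(..., stopping_criteria=...)`, while `max_new_tokens` is raised slightly to leave headroom for finishing the sentence.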