tokenizer.model_max_length for llama-2-7b-chat-hf
Thanks for this model.
When printing 'tokenizer.model_max_length', I got a number like '1000000000000000019884624838656'.
Isn't model_max_length supposed to be 4k? Not sure where this behavior stems from.
Thanks
That number is actually correct, because we solved long context.
No, just kidding, I don't know either. Have you tried the same command with Meta's version of the Llama-2 weights?
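Roughly what I mean, as a minimal sketch (assumes you can access the meta-llama/Llama-2-7b-chat-hf repo; model_max_length is a standard tokenizer kwarg, so you can also just pin it yourself):

```python
# Minimal sketch, assuming access to the meta-llama/Llama-2-7b-chat-hf repo.
# If model_max_length is missing from tokenizer_config.json, transformers falls
# back to a very large sentinel value, which is what that 1e30-ish number is.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
print(tokenizer.model_max_length)  # huge sentinel if the config omits it

# Pin the context length you expect (4096 for Llama-2) when loading:
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    model_max_length=4096,
)
print(tokenizer.model_max_length)  # 4096
```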
Hello, I'm using this model, but since yesterday, when I run it, I'm getting this error. Running on a 4090.
Traceback (most recent call last):
  File "/root/endpoint.py", line 43, in chat
    response = miner.forward(messages, num_replies = n)
  File "/root/endpoint.py", line 106, in forward
    output = self.model.generate(
  File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 2560, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Thank you for your help in advance.
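In case it helps, this inf/nan error in torch.multinomial is often traced back to half-precision overflow in the logits or to extreme sampling settings. Here is a minimal sketch of a generation call to isolate the problem (not a guaranteed fix; it assumes the hub id below is the model this thread is about and that accelerate is installed for device_map="auto"):

```python
# Sketch: load in bfloat16 (well supported on a 4090) with moderate sampling
# parameters, then try a short generation. If this runs cleanly, the issue is
# more likely in the original dtype or generation settings than in the weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # avoids common fp16 overflow issues
    device_map="auto",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=64,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```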