RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

```
RuntimeError                              Traceback (most recent call last)
in <cell line: 9>()
      7 tokenizer.padding_side = "left"
      8
----> 9 res, context, _ = model.chat(
     10     image=image,
     11     msgs=msgs,

5 frames
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py in sample(self, input_ids, logits_processor, stopping_criteria, logits_warper, max_length, pad_token_id, eos_token_id, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, synced_gpus, streamer, **model_kwargs)
   2648         # sample
   2649         probs = nn.functional.softmax(next_token_scores, dim=-1)
-> 2650         next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
   2651
   2652         # finished sentences should have their next token be a padding token

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
```
I tried the provided example in Colab using a T4 and set the model to use float16 (instead of bfloat16). Any idea how to solve this issue?
Please upgrade your torch version to 2.1.2 or above, which supports the SDPA implementation of attention.
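Before re-running the notebook, it can help to confirm the installed torch version actually meets that minimum. A minimal sketch (the `meets_minimum` helper is introduced here for illustration; it is not part of torch or transformers):

```python
def meets_minimum(installed: str, minimum: str = "2.1.2") -> bool:
    """Compare dotted version strings component-wise.

    Ignores local build tags such as "+cu121" and any pre-release suffix
    beyond the first three numeric components.
    """
    def parse(v: str) -> list[int]:
        return [int(p) for p in v.split("+")[0].split(".")[:3]]
    return parse(installed) >= parse(minimum)

# Typical usage in the notebook (requires torch installed):
#   import torch
#   if not meets_minimum(torch.__version__):
#       # In Colab: !pip install --upgrade "torch>=2.1.2", then restart runtime
#       raise RuntimeError(f"torch {torch.__version__} < 2.1.2; please upgrade")
```

After upgrading, restart the Colab runtime so the new torch version is actually imported.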