Very Slow Generation on google colab
#1
by
delitante-coder
- opened
I am loading the model in 4-bit, with bnb_4bit_compute_dtype=torch.bfloat16.
Is anyone else facing the same issue?