Half precision

#1
by ljhwild - opened

Is this compatible out of the box with half precision or quantization, as opposed to Unbabel's library implementation?

As you can see, the model size is 7 GB, which for 3.5B parameters means FP16 (2 bytes per parameter).
But you can achieve the same with the Unbabel model by changing two lines of code.
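For anyone who wants to check the arithmetic, here is a rough sketch. It only counts the weights and uses decimal gigabytes; the helper name `checkpoint_size_gb` is made up for illustration, and the 3.5e9 parameter count is the figure quoted above, not something read from the checkpoint.

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter count as quoted in this thread (an assumption, not measured).
NUM_PARAMS = 3.5e9

fp32_gb = checkpoint_size_gb(NUM_PARAMS, 4)  # FP32 stores 4 bytes per parameter
fp16_gb = checkpoint_size_gb(NUM_PARAMS, 2)  # FP16 stores 2 bytes per parameter

print(f"FP32: {fp32_gb:.1f} GB")
print(f"FP16: {fp16_gb:.1f} GB")
```

A 7 GB file lines up with the 2-bytes-per-parameter case, while FP32 weights would come to roughly twice that.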

vince62s changed discussion status to closed
