Re-Quantize Model
Hi, thank you for your work.
Would you be willing to update the model to support the latest QuiP# changes? I know you opened an issue here: https://github.com/Cornell-RelaxML/quip-sharp/issues/31. You and Minami-su are the only ones I've found who have made QuiP# quantizations so far.
It's funny, the thought occurred to me this morning too: I intuitively assumed I'd have to redo the hessians, which takes ages. Perhaps I only have to redo the latter two steps, which takes less than a day. I'll try that once my GPU is free again.
Thanks! Yeah I had checked that issue a few days ago and the dude mentioned not having to redo the hessians, so that's great news.
@igoforth I'm currently quantizing a 70b model; that takes longer than I thought. It will probably take 2 more days, and I'll be busy with New Year, too.
Its 80 layers have been running for a day already.
I finished Llama 70b and am uploading it now with the newest library version. I'm doing the same with this 34b model. Give it a day or so, @igoforth.
Updated the model.
https://huggingface.co/KnutJaegersberg/Tess-M-34B-2bit