Re-Quantize Model
Hi, thank you for your work.
Would you be willing to update the model to support the latest QuiP# changes? I know you opened an issue here: https://github.com/Cornell-RelaxML/quip-sharp/issues/31. You and Minami-su are the only ones I've found who have made QuiP# quantizations so far.
It's funny, the thought occurred to me this morning too: I intuitively assumed I'd have to redo the hessians, which takes ages. Perhaps I only have to redo the latter two steps, which takes less than a day. I'll try that once my GPU is free again.
Thanks! Yeah I had checked that issue a few days ago and the dude mentioned not having to redo the hessians, so that's great news.
@igoforth I'm currently quantizing a 70b model; that takes longer than I thought. It will probably take 2 more days, and I'll be busy with New Year, too.
Its 80 layers have been running for a day already.
I finished Llama 70b and am uploading it now with the newest library version. I'm doing the same with this 34b model. Give it a day or so, @igoforth.
Updated the model.
https://huggingface.co/KnutJaegersberg/Tess-M-34B-2bit