model optimization to decrease sizes of quants

#3
by softfluffyboy - opened

Hi, I tried this model and it works fine! Thanks again. I'm thinking about backing up the Llama 3.1 8B model on a 4.7 GB DVD, which is a cheap, common, and long-lasting backup medium: I have discs that I burned in 2008, and the MD5 checksums of the files are still OK. The problem is that the Q4_K_S quant is 4.72 GB and the Q4_K_M quant is 4.92 GB, so of course they don't fit on a DVD. A lower quant, Q4_0, is 4.69 GB and almost fits, but Q4_0 and below make the model dumber and lose quality. Are there any methods to optimize the model before quantizing it, so that the model and its quants come out smaller?
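To make the size arithmetic concrete, here is a rough Python sketch that checks which of the quants mentioned above would fit on one disc. It rests on assumptions that are not from this thread: a single-layer DVD+R holding 4,700,372,992 bytes, the listed sizes being decimal GB (10^9 bytes) rather than GiB, and an arbitrary 2 MB allowance for filesystem overhead. The names `fits_on_dvd` and `headroom_mb` are made up for illustration.

```python
# Assumption: single-layer DVD+R capacity (2,295,104 sectors * 2048 bytes).
DVD_CAPACITY_BYTES = 4_700_372_992

# Assumption: a rough allowance for filesystem structures; the real overhead varies.
FS_OVERHEAD_BYTES = 2_000_000

# Sizes as quoted in the post, assumed to be decimal GB (10**9 bytes).
QUANT_SIZES_GB = {"Q4_K_M": 4.92, "Q4_K_S": 4.72, "Q4_0": 4.69}


def gb_to_bytes(size_gb: float) -> int:
    """Convert decimal gigabytes to bytes."""
    return round(size_gb * 1_000_000_000)


def fits_on_dvd(size_gb: float,
                capacity_bytes: int = DVD_CAPACITY_BYTES,
                overhead_bytes: int = FS_OVERHEAD_BYTES) -> bool:
    """Return True if a file of size_gb plus overhead fits within capacity_bytes."""
    return gb_to_bytes(size_gb) + overhead_bytes <= capacity_bytes


def headroom_mb(size_gb: float,
                capacity_bytes: int = DVD_CAPACITY_BYTES,
                overhead_bytes: int = FS_OVERHEAD_BYTES) -> float:
    """Return spare space in decimal MB; negative means the file is too large."""
    return (capacity_bytes - overhead_bytes - gb_to_bytes(size_gb)) / 1_000_000


if __name__ == "__main__":
    for name, size_gb in QUANT_SIZES_GB.items():
        verdict = "fits" if fits_on_dvd(size_gb) else "does not fit"
        print(f"{name}: {size_gb} GB {verdict} (headroom {headroom_mb(size_gb):.1f} MB)")
```

If the listed sizes are actually GiB, none of the three would fit, so it is worth checking the exact byte count of the downloaded file before burning.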

softfluffyboy changed discussion title from decrease sizes of quants to model optimization to decrease sizes of quants
