Code for using this quantized model

#18
by Mahdimohseni0333 - opened

Hi, can anybody help me with code for running this quantized model?

As far as I know, this is the main repo where the development of stable diffusion for GGUF is happening, but I have not seen any PR related to Flux yet, so it might only be available for the ComfyUI version. There are only two bindings, one in Go and the other in C#, or you can use the CLI. I am interested in creating one for Python but don't have time.

stable-diffusion.cpp supports quantized GGUF models now. Sample code is provided in their GitHub repo here.
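For reference, a minimal CLI invocation might look something like the sketch below. This is based on the Flux instructions in the stable-diffusion.cpp docs; all file names are placeholders, and flag names may differ between releases, so check `./sd --help` for your build:

```sh
# txt2img with a quantized Flux diffusion model (file names are examples)
./sd --diffusion-model flux1-dev-q4_0.gguf \
     --vae ae.safetensors \
     --clip_l clip_l.safetensors \
     --t5xxl t5xxl_fp16.safetensors \
     -p "a lovely cat holding a sign that says 'flux.cpp'" \
     --cfg-scale 1.0 --sampling-method euler \
     -o output.png
```

Note that Flux needs the text encoders and VAE passed separately, since the quantized GGUF usually contains only the diffusion model itself.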

Note that if you are using Linux with CUDA, their release files only contain a CPU-only (RAM mode) executable, so a one-time compilation is needed to get an executable that uses the GPU.
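The one-time build is roughly the following (the CUDA flag has been renamed across releases, so check the current README of stable-diffusion.cpp before copying this):

```sh
git clone --recursive https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
mkdir build && cd build
cmake .. -DSD_CUBLAS=ON   # newer releases may use -DSD_CUDA=ON instead
cmake --build . --config Release
```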

I forgot to reply to this, but the code/instructions for the ComfyUI-specific quantization are here; they use a custom patch on the base llama.cpp repo to create the quantized files.
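For anyone who wants to reproduce the quantized files themselves, the flow described in the ComfyUI-GGUF tools README is roughly the sketch below. The tag, patch file name, and model file names here are illustrative and may have changed, so treat the repo's own README as authoritative:

```sh
# patch llama.cpp so its quantizer accepts image-model tensors
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout tags/b3600    # the patch targets a specific llama.cpp tag
git apply ../lcpp.patch    # patch shipped with the ComfyUI-GGUF tools

# build the llama-quantize tool, then convert and quantize:
python convert.py --src flux1-dev.safetensors
./llama-quantize flux1-dev-F16.gguf flux1-dev-Q4_K_S.gguf Q4_K_S
```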

As far as I know, the files created with stable-diffusion.cpp are compatible as well.
