Any way to make it work with an fp8 transformer? #54
by chuckma - opened
import torch
from diffusers import FluxTransformer2DModel, FluxPipeline
from transformers import T5EncoderModel, CLIPTextModel
from huggingface_hub import hf_hub_download
from optimum.quanto import freeze, qfloat8, quantize

bfl_repo = "black-forest-labs/FLUX.1-dev"
dtype = torch.bfloat16

# Load the fp8 single-file checkpoint, then quantize the weights to qfloat8 and freeze them
transformer = FluxTransformer2DModel.from_single_file("https://huggingface.co/Kijai/flux-fp8/blob/main/flux1-dev-fp8.safetensors", torch_dtype=dtype)
quantize(transformer, weights=qfloat8)
freeze(transformer)

# Build the pipeline around the quantized transformer
pipe = FluxPipeline.from_pretrained(bfl_repo,
                                    transformer=transformer,
                                    torch_dtype=dtype)
pipe.enable_model_cpu_offload()

# Take the 8-steps LoRA as an example
repo_name = "ByteDance/Hyper-SD"
ckpt_8steps_name = "Hyper-FLUX.1-dev-8steps-lora.safetensors"
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_8steps_name),
                       adapter_name="default")
It does not work if I quantize the transformer first. I want to be able to switch to another LoRA even after quantizing the transformer. Any advice?
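To make the question concrete, the kind of switching I mean looks roughly like this (the second repo id, file name, and adapter name below are placeholders, not specific checkpoints):

# Load the Hyper-SD 8-step LoRA as adapter "default", as in the snippet above
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_8steps_name),
                       adapter_name="default")
# ... run some inference ...

# Later, drop it and switch to a different LoRA on the same quantized transformer
pipe.delete_adapters("default")
pipe.load_lora_weights("some-user/some-other-flux-lora",      # placeholder repo id
                       weight_name="other-lora.safetensors",  # placeholder file name
                       adapter_name="other")
pipe.set_adapters("other")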
Hi, have you solved this problem?
I see. A newer diffusers version solves the problem (maybe in 0.31).
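For anyone landing here later, a minimal sketch of the fix, assuming diffusers >= 0.31 (the exact version is a guess from the comment above) and the variables from the first snippet:

# Upgrade diffusers first (the version pin is a guess):
#   pip install -U "diffusers>=0.31.0"
# With a recent diffusers, the original call should now work on the quantized transformer:
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_8steps_name),
                       adapter_name="default")
# Switching to another adapter later should then also be possible via
# pipe.delete_adapters(...) / pipe.set_adapters(...).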
chuckma changed discussion status to closed