How to fix "RuntimeError: expected scalar type Half but found Float" when using fp16

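In <pythondistr>\Lib\site-packages\torch\nn\modules\, replace lines 272-273 (first snippet) with the version that casts the weight and bias to the input dtype (second snippet):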

        return F.group_norm(
            input, self.num_groups, self.weight, self.bias, self.eps)


        return F.group_norm(
            input, self.num_groups, self.weight.type(input.dtype), self.bias.type(input.dtype), self.eps)
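Edits inside site-packages are lost whenever torch is reinstalled. Here is a minimal sketch of the same cast as a subclass instead, assuming you are able to swap it in for nn.GroupNorm in your own model code. DtypeMatchingGroupNorm is a hypothetical name and is not part of torch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DtypeMatchingGroupNorm(nn.GroupNorm):
    """GroupNorm that casts its affine parameters to the input's dtype.

    Hypothetical helper: it mirrors the two-line edit above without
    modifying the installed torch package.
    """

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # With affine=False, weight and bias are None and need no cast.
        weight = None if self.weight is None else self.weight.to(input.dtype)
        bias = None if self.bias is None else self.bias.to(input.dtype)
        return F.group_norm(input, self.num_groups, weight, bias, self.eps)
```

For example, DtypeMatchingGroupNorm(2, 4) keeps float32 parameters but accepts a float64 input and returns a float64 output; on a GPU the same cast would apply to a half-precision input.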

In <pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\, after this section (around lines 102-107), add the latents.half() line shown below it:

        latents = torch.randn(
            (batch_size, self.unet.in_channels, height // 8, width // 8),


        latents = latents.half()

Finally, in <pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\, replace this (on lines 160-161) with the three-line version that follows it, which casts pixel_values to half precision:

        safety_cheker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
        image, has_nsfw_concept = self.safety_checker(images=image, clip_input=safety_cheker_input.pixel_values)


        safety_cheker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
        safety_cheker_input.pixel_values = safety_cheker_input.pixel_values.half()
        image, has_nsfw_concept = self.safety_checker(images=image, clip_input=safety_cheker_input.pixel_values)

Hey @TessaCoil,

Thanks for the fix here! Does it happen when loading weights in torch.float16?

Could you maybe post a code snippet that currently leads to an error/bug? :-)

@patrickvonplaten I think it happens when you switch to CPU, at least in one case (likely self-hosted setups) that I have seen people complain about "in the wild", rather than with the default GPU-driven device selection. That is how I understood it, anyway.

Setting the device to CPU then produces a complaint about no support for half-precision tensors, or vice versa. At first glance, this looks like a fix for that.

Here is a code snippet that causes the error.

import torch
from diffusers import StableDiffusionPipeline

TOKEN = 'hugging_face_token'

# get your token at

def run():
    pipe = StableDiffusionPipeline.from_pretrained(
        ...  # arguments not shown in the original post
    )

    prompt = "a photo of an astronaut riding a horse on mars"
    image = pipe(prompt)["sample"][0]
    image.save("astronaut_rides_horse.png")

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    run()

@TessaCoil - I get the same error around line 82 (<pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\):

text_embeddings = self.text_encoder([0]

So none of the later modifications are reached. What do you reckon I should change?

Edit: I forgot to wrap pipe(prompt)["sample"][0] in autocast("cuda").

Hey, I'm facing a similar issue on the 'cpu' device,
as no GPU (CUDA) is available.

If I set torch_dtype=torch.float16,
then it throws
RuntimeError: expected scalar type Float but found BFloat16

If I set torch_dtype=torch.bfloat16,
then it throws
RuntimeError: expected scalar type BFloat16 but found Float

If I set torch_dtype=torch.half,
then it throws
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

If I set torch_dtype=torch.double,
then it throws
RuntimeError: expected scalar type BFloat16 but found Double

If I set torch_dtype=torch.long,
then it throws
raise TypeError(' only accepts floating point or complex '
TypeError: only accepts floating point or complex dtypes, but got desired dtype=torch.int64

So I am really confused about which torch_dtype to use for a successful run.

I came across the same error. I am also using diffusers 1.4. I added the with torch.autocast("cuda"): line above the pipe(prompt, latents=latents) call and the problem was solved.

Thanks, I also added with torch.autocast("cuda"): and it works for me.
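For anyone who wants the autocast fix in one place, here is a minimal sketch. generate_image is a hypothetical helper name, and pipe is assumed to be an already loaded StableDiffusionPipeline that is called the same way as in the snippet earlier in this thread.

```python
import torch


def generate_image(pipe, prompt, device_type="cuda"):
    """Run the pipeline under autocast and return the first sample.

    Hypothetical helper: `pipe` is assumed to return a dict with a
    "sample" list, as in the snippet posted earlier in this thread.
    """
    # autocast picks a lower-precision dtype for eligible ops, so
    # float32 inputs can be fed to half-precision weights.
    with torch.autocast(device_type):
        return pipe(prompt)["sample"][0]
```

Calling generate_image(pipe, prompt) is then equivalent to putting the with torch.autocast("cuda"): line directly above the pipe(prompt) call.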

Hi! I have a problem: Input type (float) and bias type (struct c10::Half) should be the same
The error is in \Lib\site-packages\torch\nn\modules\
File "C:\Users\user\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\user\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same

Do you know what I can do here?

CompVis org

Hi @MohammadMi ! As others mentioned, this usually happens when attempting to run the model in half precision on CPU, because CPU does not support half floats. Do you have a GPU in your computer, and are you trying to use it? Do you have a code snippet that demonstrates the problem?
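A minimal sketch of choosing the precision from the available device, assuming full precision (float32) is the safe choice on CPU and half precision is only requested when CUDA is present. pick_device_and_dtype is a hypothetical helper, and it does not detect GPUs that handle half precision poorly.

```python
import torch


def pick_device_and_dtype():
    """Return ("cuda", torch.float16) on a CUDA GPU, else ("cpu", torch.float32)."""
    if torch.cuda.is_available():
        return "cuda", torch.float16
    return "cpu", torch.float32


device, dtype = pick_device_and_dtype()
# Assumed usage: pass torch_dtype=dtype to from_pretrained(...),
# then move the pipeline with pipe.to(device).
```

On a machine without a GPU this yields "cpu" and torch.float32, which avoids asking the CPU for half-precision kernels.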

Yes, I have a GPU: a GTX 1060Ti.
I didn’t change any settings.
Do you need to see my webui-user? I set --no-half --lowvram --opt-split-attention.
Which code snippet do you mean?

CompVis org

I think that card doesn't properly support half float, unfortunately. See here for details about a similar card and some tricks to make it work using the diffusers library.

I hit this problem when using concurrent.futures for multi-threaded inference, and I have not been able to fix it yet.
With num_workers = 1, everything works fine.
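Nothing in this thread establishes the cause of the multi-threaded failure. Assuming it comes from several threads calling one shared pipeline object at the same time, one possible workaround is to serialise the calls with a lock while keeping the thread pool. SerializedPipeline and run_prompts are hypothetical names, and pipe stands for any callable pipeline.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedPipeline:
    """Wrap a callable so that only one thread runs it at a time."""

    def __init__(self, pipe):
        self._pipe = pipe
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        # Other threads block here until the current call finishes.
        with self._lock:
            return self._pipe(*args, **kwargs)


def run_prompts(pipe, prompts, num_workers=4):
    """Submit prompts from a thread pool; results keep the input order."""
    safe_pipe = SerializedPipeline(pipe)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(safe_pipe, prompts))
```

This behaves like num_workers = 1 for the model call itself, so it gives no speed-up on inference; it only helps if other per-request work runs outside the lock.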

