FID Score

#10
by ori-m-2022a - opened

Calculating the FID score of the model on 50k generated images gives 12.437098284576962.
This is much higher than the FID reported in the original paper (3.17).
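
For context, this is the standard definition of FID, stated here as background rather than anything specific to this checkpoint. Gaussians are fitted to Inception features of the real and the generated images, with means $\mu_r, \mu_g$ and covariances $\Sigma_r, \Sigma_g$, and the score is the Fréchet distance between them:

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
$$

Lower is better. Because the means and covariances are estimated from finite samples, the number depends on how many images are used and on which reference set the statistics come from.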

Hi @study-work-2022a, could you please share the script you've used to calculate FID? I'd like to reproduce your result to debug this.

I used this code to calculate the FID: https://github.com/mseitzer/pytorch-fid.
To generate the images, I used code similar to the provided example.
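
For anyone trying to reproduce the number, the setup described above might look roughly like the sketch below. It is an assumption about the workflow, not the poster's actual script: the output directory, batch size, seed, and the helper names `batch_sizes` and `generate_samples` are made up for illustration, and the `DDPMPipeline` call follows the usual Diffusers usage for `google/ddpm-cifar10-32`.

```python
from pathlib import Path


def batch_sizes(total, batch):
    """Split `total` samples into batch sizes, e.g. 10 with batch 4 -> [4, 4, 2]."""
    full, rest = divmod(total, batch)
    return [batch] * full + ([rest] if rest else [])


def generate_samples(out_dir="generated", total=50_000, batch=500, steps=1000, seed=0):
    """Sample `total` images from google/ddpm-cifar10-32 and save them as PNG files."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import torch
    from diffusers import DDPMPipeline

    torch.manual_seed(seed)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32").to(device)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index = 0
    for n in batch_sizes(total, batch):
        # `.images` is a list of PIL images
        images = pipe(batch_size=n, num_inference_steps=steps).images
        for img in images:
            img.save(out / f"{index:05d}.png")  # lossless PNG rather than JPEG
            index += 1
    return index


# Usage (hypothetical paths):
#   generate_samples()
#   python -m pytorch_fid generated path/to/real_cifar10_png
```

Calling `generate_samples()` and then running the `pytorch_fid` command in the trailing comment compares the two image folders. The folder of real images is left as a placeholder path.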

Any update on this? I also observe the same FID.

Any update on this? I trained the improved-ddpm code from https://github.com/openai/improved-diffusion and checked the FID score, which is much worse than the one reported in the paper... :(

Hey @TurtleNeck ,

Diffusers DDPM is an exact clone of https://github.com/pesser/pytorch_diffusion,

which is the PyTorch implementation of the original TensorFlow version.

Can you check whether you get better results with pesser/pytorch_diffusion?

Also note that this is the non-EMA checkpoint; maybe they used the EMA checkpoint for the FID evaluation in their paper.
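
To make the EMA point concrete: an EMA checkpoint stores an exponential moving average of the weights over training steps instead of the raw weights from the last step. The sketch below is a toy illustration on plain Python lists, not the training code of either repository. The function name `ema_update` is made up, and the default decay of 0.9999 is the value the DDPM paper describes.

```python
def ema_update(shadow, params, decay=0.9999):
    """Return the updated EMA ("shadow") weights after one training step."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]


# Toy example: a single scalar "weight" jumps from 0.0 to 1.0 and stays there.
shadow = [0.0]
for _ in range(1000):
    shadow = ema_update(shadow, [1.0], decay=0.99)

print(shadow)  # close to, but still slightly below, the raw weight 1.0
```

The averaged weights lag behind the raw ones and smooth out step-to-step noise, which is why the two checkpoints can give noticeably different sample quality.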

Is there an EMA checkpoint for this model?

Hey guys. We converted the official EMA checkpoint to Diffusers. Hope this will be helpful: https://github.com/VainF/Diff-Pruning/releases/download/v0.0.1/ddpm_ema_cifar10.zip

A bash script for converting is available at https://github.com/VainF/Diff-Pruning/blob/main/tools/convert_cifar10_ddpm_ema.sh

This checkpoint achieves an FID score of 4.5 with 100 DDIM steps.

I used the EMA checkpoint, but I get an FID score of 5.23 with 1000 DDPM steps. Could you give me a link to your FID script?

Hi @speiqin , I modified the original diffusers codebase to fix this issue. You can try this script to reproduce my results (50k samples, FID=4.5, 100-step DDIM). But I think DDPM sampling is not affected by this issue. How many samples did you use for FID estimation?

Hi @Vinnnf, I now get an FID close to the one in the DDPM paper. The main mistake in my experiment (FID of 5.23) was that I used only 10k images to calculate the FID score. With 50k images, I get the right result. Thanks for your checkpoint.
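
The effect of the sample count can be seen in a toy setting. The sketch below is a simplified one-dimensional stand-in for FID, not the real Inception-based metric. It draws two samples from the same Gaussian, so the true distance is zero, and measures the Fréchet distance between their fitted Gaussians. The function names and the sample sizes are chosen only for illustration.

```python
import math
import random


def mean_std(xs):
    """Sample mean and (unbiased) standard deviation of a list of floats."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, math.sqrt(var)


def frechet_1d(xs, ys):
    """Fréchet distance between 1-D Gaussians fitted to two samples."""
    mx, sx = mean_std(xs)
    my, sy = mean_std(ys)
    return (mx - my) ** 2 + (sx - sy) ** 2


def average_distance(n, trials=100, seed=0):
    """Average distance between two size-n samples of the same N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += frechet_1d(xs, ys)
    return total / trials


small = average_distance(50)
large = average_distance(1000)
print(small, large)  # the small-sample estimate is noticeably larger
```

Even though both samples come from the same distribution, the estimate stays above zero and shrinks as the sample grows. The same kind of upward bias applies to real FID, so scores computed from 10k and 50k images are not directly comparable.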

@ori-m-2022a Hello. Could you tell me the final FID score for 50k samples from the checkpoint "google/ddpm-cifar10-32" with T = 1000? I got 12.9786970050562, and I don't know whether it's correct. Thank you very much!

Hi @Vinnnf, I found that the dropout rate in the converted checkpoint is 0.0, but the official model was trained with a dropout rate of 0.1. Is this a mistake?

Hi all, I am sampling 10k examples with DDPM (T=1000), 50k with DDIM (T=100), and 10k with PNDM (T=100). The best FID score I can get is around 36. I am computing the FID of the sampled images with respect to the images from the torchvision CIFAR-10 dataset, saved in a directory. The script I use to compute FID is https://github.com/mseitzer/pytorch-fid. I am very new to FID and not sure where I made a mistake. I would really appreciate any help and advice.

Cheers
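
One thing worth ruling out with a setup like this is how the reference images were written to disk. The sketch below shows one way to export the torchvision CIFAR-10 images as lossless PNG files for pytorch-fid. It is a guess at a reasonable procedure, not a diagnosis of the score of 36: the directory names and the helpers `png_name` and `export_cifar10` are hypothetical. It defaults to the training split because the DDPM paper reports its headline FID with respect to the training set.

```python
from pathlib import Path


def png_name(index, width=5):
    """Zero-padded file name, e.g. 42 -> '00042.png'."""
    return f"{index:0{width}d}.png"


def export_cifar10(out_dir="cifar10_train_png", root="data", train=True):
    """Write the CIFAR-10 images to `out_dir` as lossless PNG files."""
    # Imported lazily; requires torchvision to be installed.
    from torchvision.datasets import CIFAR10

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Without a transform, each item is a (PIL image, label) pair.
    dataset = CIFAR10(root=root, train=train, download=True)
    for i, (image, _label) in enumerate(dataset):
        image.save(out / png_name(i))
    return len(dataset)
```

Other things to compare against the earlier posts are the number of generated samples (50k rather than 10k) and whether the generated images were saved in a lossy format such as JPEG.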

Same process here. How did you solve this? Many thanks.

Has anyone solved this problem? I generated 50k images using a 100-step DDIM sampler and got an FID of 11.214451002914018, which is far from 3.17...
