DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation - Faces

Creators: Gwanghyun Kim, Taesung Kwon, Jong Chul Ye

Paper: https://arxiv.org/abs/2110.02711

Figure: excerpt from the DiffusionCLIP paper comparing DiffusionCLIP with other methods on image reconstruction, manipulation, and style transfer.

DiffusionCLIP is a text-guided image manipulation method built on diffusion models. Its nearly perfect inversion capability makes it well suited to image manipulation and is an important advantage over GAN-based models. This checkpoint was trained on the CelebA-HQ dataset, which is available on the Hugging Face Hub (https://huggingface.co/datasets/huggan/CelebA-HQ).
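To illustrate what "nearly perfect inversion" means, here is a minimal sketch of deterministic DDIM-style inversion followed by reconstruction. It is a toy under stated assumptions, not the repository's implementation: `toy_eps` is a hypothetical stand-in for the trained noise predictor (a real one also takes the timestep), the linear `alphas` schedule is made up for illustration, and plain Python lists replace image tensors to keep the example dependency-free.

```python
import math
import random


def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update from cumulative alpha a_from to a_to."""
    # Predict the clean sample implied by the current noise estimate.
    x0_pred = [(xi - math.sqrt(1.0 - a_from) * ei) / math.sqrt(a_from)
               for xi, ei in zip(x, eps)]
    # Re-noise that prediction to the target noise level with the same estimate.
    return [math.sqrt(a_to) * pi + math.sqrt(1.0 - a_to) * ei
            for pi, ei in zip(x0_pred, eps)]


def toy_eps(x):
    # Hypothetical noise predictor (assumption): smooth and input-dependent.
    return [0.1 * math.tanh(xi) for xi in x]


def invert(x0, alphas, eps_fn):
    """Walk from the clean sample toward the noisy latent."""
    x = list(x0)
    for a_from, a_to in zip(alphas[:-1], alphas[1:]):
        x = ddim_step(x, eps_fn(x), a_from, a_to)
    return x


def reconstruct(latent, alphas, eps_fn):
    """Walk the same schedule backwards, from the latent to a clean sample."""
    x = list(latent)
    rev = alphas[::-1]
    for a_from, a_to in zip(rev[:-1], rev[1:]):
        x = ddim_step(x, eps_fn(x), a_from, a_to)
    return x


steps = 50
# Made-up schedule (assumption): cumulative alpha falls linearly from 1.0 to 0.5.
alphas = [1.0 - 0.5 * i / steps for i in range(steps + 1)]

rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(16)]  # stand-in for image pixels

latent = invert(x0, alphas, toy_eps)
x_rec = reconstruct(latent, alphas, toy_eps)
max_error = max(abs(a - b) for a, b in zip(x0, x_rec))
```

In this toy setting `max_error` is small but not exactly zero, because the reverse pass evaluates the noise predictor at slightly different points than the forward pass did. Using more steps shrinks the gap, which is the sense in which the inversion is "nearly" perfect.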

This checkpoint is best suited to manipulation, reconstruction, and style transfer on images of human faces using the DiffusionCLIP model. To use the ID loss, which preserves the identity of the human face, you need to download the pretrained IR-SE50 model from TreB1eN. Additional information is available in the GitHub repository.
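As a rough illustration of how an identity-preserving loss of this kind is commonly formulated, the sketch below computes one minus the cosine similarity between two face embeddings. This is an assumption about the general technique, not a copy of the repository's code: the function names are hypothetical, the embeddings are hand-written vectors rather than IR-SE50 outputs, and the actual loss may differ in weighting and preprocessing.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def id_loss(emb_source, emb_edited):
    """Hypothetical identity loss: 0 when the embeddings point the same way."""
    return 1.0 - cosine_similarity(emb_source, emb_edited)


# Hand-written stand-ins for face-recognition embeddings (assumption).
same = id_loss([0.6, 0.8, 0.0], [0.6, 0.8, 0.0])       # identical identity
different = id_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])  # orthogonal embeddings
```

Minimizing a loss like this during fine-tuning discourages edits that change who the person appears to be, while still allowing attributes such as expression or style to change.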

Credits

Code repository available at: https://github.com/gwang-kim/DiffusionCLIP

Citation

@article{kim2021diffusionclip,
  title={Diffusionclip: Text-guided image manipulation using diffusion models},
  author={Kim, Gwanghyun and Ye, Jong Chul},
  journal={arXiv preprint arXiv:2110.02711},
  year={2021}
}