---
license: mit
language:
- en
library_name: diffusers
---

# Arc2Face Model Card

[**Project Page**](https://arc2face.github.io/) **|** [**Paper (ArXiv)**](https://arxiv.org/abs/2403.11641) **|** [**Code**](https://github.com/foivospar/Arc2Face) **|** [🤗 **Gradio demo**](https://huggingface.co/spaces/FoivosPar/Arc2Face)

## Introduction

Arc2Face is an ID-conditioned face model that generates diverse, ID-consistent photos of a person given only their ArcFace ID-embedding. It is trained on a restored version of the WebFace42M face recognition database and further fine-tuned on FFHQ and CelebA-HQ.

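ID consistency is commonly quantified as the cosine similarity between the ArcFace embedding of the reference face and that of a generated face. The following is a minimal sketch of that comparison, assuming the embeddings have already been extracted as plain sequences of floats. The function name `cosine_similarity` and the toy 4-dimensional vectors are illustrative only and are not part of the Arc2Face code.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real ArcFace embeddings (hypothetical values).
reference = [0.5, 0.5, 0.5, 0.5]
generated = [0.5, 0.5, 0.5, -0.5]
score = cosine_similarity(reference, generated)
```

Values closer to 1 indicate closer agreement between the two identities. The threshold at which two faces count as the same identity depends on the face recognition model and is not specified here.
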
## Model Details

The model consists of two components:
- encoder, a fine-tuned CLIP ViT-L/14 model
- arc2face, a fine-tuned UNet model

Both are fine-tuned from [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5). The encoder is tailored for projecting ID-embeddings to the CLIP latent space. Arc2Face adapts the pre-trained backbone to the task of ID-to-face generation, conditioned solely on ID vectors.

## ControlNet (pose)

We also provide a ControlNet model trained on top of Arc2Face for pose control.

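The components described above are stored as subfolders of this repository, each holding a config file and a weights file (the file names are those used in the Usage section). As a sketch of fetching only the components you need, the helper below builds the file list and downloads it with `hf_hub_download`. The names `COMPONENT_FILES`, `files_for`, and `download_components` are hypothetical and are not part of the Arc2Face code.

```python
from typing import Dict, Iterable, List

REPO_ID = "FoivosPar/Arc2Face"

# Weight file per component subfolder, as listed in the Usage section.
COMPONENT_FILES: Dict[str, str] = {
    "arc2face": "diffusion_pytorch_model.safetensors",
    "encoder": "pytorch_model.bin",
    "controlnet": "diffusion_pytorch_model.safetensors",
}


def files_for(components: Iterable[str]) -> List[str]:
    """Return the repo-relative config and weight paths for each component."""
    paths = []
    for name in components:
        if name not in COMPONENT_FILES:
            raise KeyError(f"unknown component: {name}")
        paths.append(f"{name}/config.json")
        paths.append(f"{name}/{COMPONENT_FILES[name]}")
    return paths


def download_components(components: Iterable[str], local_dir: str = "./models") -> List[str]:
    """Download the selected components and return their local file paths."""
    # Imported lazily so the file list can be inspected without the dependency.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=path, local_dir=local_dir)
        for path in files_for(components)
    ]
```

For example, `download_components(["encoder", "arc2face"])` would skip the ControlNet weights when pose control is not needed.
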
## Usage

The models can be downloaded directly from this repository or with Python:

```python
from huggingface_hub import hf_hub_download

# Arc2Face UNet
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/diffusion_pytorch_model.safetensors", local_dir="./models")

# ID encoder
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/pytorch_model.bin", local_dir="./models")

# ControlNet (pose)
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="controlnet/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="controlnet/diffusion_pytorch_model.safetensors", local_dir="./models")
```

Please check our [GitHub repository](https://github.com/foivospar/Arc2Face) for complete inference instructions.

## Limitations and Bias

- Only one person per image can be generated.
- Poses are constrained to the frontal hemisphere, similar to FFHQ images.
- The model may reflect the biases of the training data or the ID encoder.

## Citation

**BibTeX:**

```bibtex
@inproceedings{paraperas2024arc2face,
  title={Arc2Face: A Foundation Model for ID-Consistent Human Faces},
  author={Paraperas Papantoniou, Foivos and Lattas, Alexandros and Moschoglou, Stylianos and Deng, Jiankang and Kainz, Bernhard and Zafeiriou, Stefanos},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2024}
}
```