Anonymize Anyone: Toward Race Fairness in Text-to-Face Synthesis using Simple Preference Optimization in Diffusion Model

For detailed information, code, and documentation, please visit our GitHub repository: Anonymize-Anyone

Model

Anonymize Anyone is a text-to-face synthesis method built on a diffusion model and designed with race fairness in mind. Our method uses facial segmentation masks to restrict editing to specific facial regions, and employs a Stable Diffusion v2 Inpainting model trained on a curated Asian dataset. We introduce two losses: ℒ𝐹𝐹𝐸 (Focused Feature Enhancement Loss), which improves performance when training data is limited, and ℒ𝑫𝑰𝑭𝑭 (Difference Loss), which mitigates catastrophic forgetting. Finally, we apply Simple Preference Optimization (SimPO) to refine the generated images.
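For readers unfamiliar with SimPO, the sketch below computes the preference loss for a single chosen/rejected pair in the form originally published for language models: a length-normalized log-likelihood difference, scaled by β and offset by a target margin γ. This card does not describe how the objective is adapted to diffusion training, so treat the sketch as background only. The function name `simpo_loss`, its arguments, and the default values of `beta` and `gamma` are illustrative assumptions, not taken from the repository.

```python
import math

def simpo_loss(logp_chosen, len_chosen, logp_rejected, len_rejected,
               beta=2.0, gamma=1.0):
    """SimPO loss for one preference pair (language-model formulation).

    logp_*: summed log-probability of each response under the policy.
    len_*:  response length, used for length normalization.
    beta, gamma: illustrative placeholder values, not the authors' settings.
    """
    # Reference-free reward margin: scaled average log-likelihood gap minus gamma.
    margin = beta * (logp_chosen / len_chosen - logp_rejected / len_rejected) - gamma
    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss shrinks as the chosen sample becomes more likely than the rejected one by more than the margin γ. Unlike DPO, no reference model appears in the objective.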

Model Checkpoints

Using with Diffusers🧨

You can use this model directly with the diffusers library:

import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

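# Use the GPU when available. float16 is only reliable on CUDA, so fall back to float32 on CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
dtype = torch.float16 if device.type == 'cuda' else torch.float32
sd_pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "fh2c1/Anonymize-Anyone",
    torch_dtype=dtype,
    safety_checker=None,
).to(device)
# Load the SimPO LoRA adapter and apply it at half strength.
sd_pipe.load_lora_weights("fh2c1/SimPO-LoRA", adapter_name="SimPO")
sd_pipe.set_adapters(["SimPO"], adapter_weights=[0.5])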

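def generate_image(image_path, mask_path, prompt, negative_prompt, pipe, seed):
    """Inpaint the masked region of the image. Returns a PIL image, or None if loading fails."""
    try:
        # convert() forces the files to be read here, so decoding errors are caught as well.
        in_image = Image.open(image_path).convert("RGB")
        in_mask = Image.open(mask_path).convert("L")
    except OSError as e:
        print(f"Loading error: {e}")
        return None
    # Seed a generator on the pipeline's device so the result is reproducible.
    generator = torch.Generator(pipe.device).manual_seed(seed)
    result = pipe(image=in_image, mask_image=in_mask, prompt=prompt,
                  negative_prompt=negative_prompt, generator=generator)
    return result.images[0]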

image = '/content/Anonymize-Anyone/data/2.png'
mask = "/content/Anonymize-Anyone/data/2_mask.png"
prompt = "he is an asian man."
seed = 38189219984105
negative_prompt = "low resolution, ugly, disfigured, bad, immature, cartoon, anime, 3d, painting, b&w, deformed eyes, low quality, noise"

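# Initialize the result so the final line still works if generation raises a TypeError.
generated_image = None
try:
    generated_image = generate_image(image_path=image, mask_path=mask, prompt=prompt,
                                     negative_prompt=negative_prompt, pipe=sd_pipe, seed=seed)
except TypeError as e:
    print(f"TypeError : {e}")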

generated_image

For more detailed usage instructions, including how to prepare segmentation masks and run inference, please refer to our GitHub repository.
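The repository documents the actual segmentation-based mask preparation. As a rough stand-in for quick experiments, here is a minimal sketch that draws a rectangular mask with Pillow. It relies on the diffusers inpainting convention that white pixels are repainted and black pixels are kept. The helper `make_box_mask`, the image size, and the box coordinates are all hypothetical and are not part of the repository.

```python
from PIL import Image, ImageDraw

def make_box_mask(size, box):
    """Return a single-channel mask: white (255) inside `box`, black (0) elsewhere.

    size: (width, height) of the input image in pixels.
    box:  (left, top, right, bottom) of the region to repaint.
    Hypothetical helper for illustration; not the repository's mask pipeline.
    """
    mask = Image.new("L", size, 0)                 # all black: keep every pixel
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # white: region to be inpainted
    return mask

# Illustrative values only: a 512x512 image with the face roughly centered.
face_mask = make_box_mask((512, 512), (128, 96, 384, 416))
```

A box mask repaints everything inside the rectangle, including hair and background, so a face segmentation mask as described in the repository should give tighter edits.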

Training

For information on how to train the model, including the use of ℒ𝐹𝐹𝐸 (Focused Feature Enhancement Loss) and ℒ𝑫𝑰𝑭𝑭 (Difference Loss), please see our GitHub repository's training section.
