Papers
arxiv:2409.08248

TextBoost: Towards One-Shot Personalization of Text-to-Image Models via Fine-tuning Text Encoder

Published on Sep 12
· Submitted by akhaliq on Sep 13
Authors:

Abstract

Recent breakthroughs in text-to-image models have opened up promising research avenues in personalized image generation, enabling users to create diverse images of a specific subject using natural language prompts. However, existing methods often suffer from performance degradation when given only a single reference image. They tend to overfit the input, producing highly similar outputs regardless of the text prompt. This paper addresses the challenge of one-shot personalization by mitigating overfitting, enabling the creation of controllable images through text prompts. Specifically, we propose a selective fine-tuning strategy that focuses on the text encoder. Furthermore, we introduce three key techniques to enhance personalization performance: (1) augmentation tokens to encourage feature disentanglement and alleviate overfitting, (2) a knowledge-preservation loss to reduce language drift and promote generalizability across diverse prompts, and (3) SNR-weighted sampling for efficient training. Extensive experiments demonstrate that our approach efficiently generates high-quality, diverse images using only a single reference image while significantly reducing memory and storage requirements.

Community

Paper submitter

Very interesting job. Does the training process involve different data augmentation methods? If so, do they correspond to different enhanced pseudo-words A*?

·
Paper author

Thank you for your interest in our work :) We indeed applied various types of augmentations, including a range of geometric and color transformations (Figure 10). As you kindly mentioned, each augmentation corresponds to a specific A*. For more details on the technical implementation, please feel free to visit our GitHub repository!

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2409.08248 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2409.08248 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2409.08248 in a Space README.md to link it from this page.

Collections including this paper 1