# Control image brightness

The Stable Diffusion pipeline is mediocre at generating images that are very bright or very dark, as explained in the [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) paper. The solutions proposed in the paper are currently implemented in [`DDIMScheduler`], which you can use to improve the brightness of your images.

<Tip>

💡 See the paper linked above for more details about the proposed solutions!

</Tip>

One of the solutions is to train a model with *v prediction* and *v loss*. Add the following flag to the [`train_text_to_image.py`](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py) or [`train_text_to_image_lora.py`](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_lora.py) script to enable `v_prediction`:

```bash
--prediction_type="v_prediction"
```
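
As a rough illustration of what `v_prediction` changes, the sketch below computes the *v* target that the model learns to predict instead of the noise, using the common definition `v = sqrt(alpha_bar_t) * noise - sqrt(1 - alpha_bar_t) * x0`. The helper name `v_target` and the plain-list inputs are assumptions made for this example; they are not part of the training scripts.

```python
import math


def v_target(x0, noise, alpha_bar_t):
    """Return the v-prediction target for a single timestep.

    x0 and noise are equal-length lists of floats (a flattened clean sample
    and the Gaussian noise added to it). alpha_bar_t is the cumulative
    product of (1 - beta) up to timestep t.
    """
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * e - noise_scale * x for x, e in zip(x0, noise)]


x0 = [0.5, -1.0]
noise = [0.2, 0.3]

# With no noise added (alpha_bar_t = 1) the target is the noise itself.
almost_clean = v_target(x0, noise, alpha_bar_t=1.0)
# At pure noise (alpha_bar_t = 0) the target is the negated clean sample.
pure_noise = v_target(x0, noise, alpha_bar_t=0.0)
```

The second case is why *v prediction* pairs well with a zero terminal SNR schedule: when the input is pure noise, predicting the noise tells the model nothing, whereas the *v* target still depends on the clean sample.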

For example, let's use the [`ptx0/pseudo-journey-v2`](https://huggingface.co/ptx0/pseudo-journey-v2) checkpoint, which has been fine-tuned with `v_prediction`.

Next, set the following parameters in the [`DDIMScheduler`]:

1. `rescale_betas_zero_snr=True` rescales the noise schedule to zero terminal signal-to-noise ratio (SNR).
2. `timestep_spacing="trailing"` starts sampling from the last timestep.

```py
>>> from diffusers import DiffusionPipeline, DDIMScheduler

>>> pipeline = DiffusionPipeline.from_pretrained("ptx0/pseudo-journey-v2")

>>> # switch the scheduler in the pipeline to use the DDIMScheduler
>>> pipeline.scheduler = DDIMScheduler.from_config(
...     pipeline.scheduler.config, rescale_betas_zero_snr=True, timestep_spacing="trailing"
... )
>>> pipeline.to("cuda")
```
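
To make the two scheduler settings more concrete, here is a dependency-free sketch of what they do, following the description in the paper. The function names are hypothetical, the beta values are assumed to match a Stable Diffusion style scaled-linear schedule, and the `leading` variant omits any step offset. The scheduler's own implementation may differ in detail.

```python
import math


def rescale_zero_terminal_snr(betas):
    """Rescale a beta schedule so the last timestep has zero SNR."""
    # Cumulative product of alphas, then its square root.
    alpha_bar = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    sqrt_ab = [math.sqrt(a) for a in alpha_bar]

    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the last value becomes zero, then scale so the first is kept.
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]

    # Convert the rescaled cumulative products back to per-step betas.
    new_ab = [s * s for s in sqrt_ab]
    alphas = [new_ab[0]] + [new_ab[i] / new_ab[i - 1] for i in range(1, len(new_ab))]
    return [1.0 - a for a in alphas]


def trailing_timesteps(num_train_timesteps, num_inference_steps):
    """Evenly spaced timesteps that always include the last training step."""
    step = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * step) - 1 for i in range(num_inference_steps)]


def leading_timesteps(num_train_timesteps, num_inference_steps):
    """Evenly spaced timesteps that start from 0 and never reach the last step."""
    step = num_train_timesteps // num_inference_steps
    return [i * step for i in reversed(range(num_inference_steps))]


# Assumed scaled-linear schedule with 1000 training timesteps.
T = 1000
start, end = math.sqrt(0.00085), math.sqrt(0.012)
betas = [(start + (end - start) * i / (T - 1)) ** 2 for i in range(T)]

new_betas = rescale_zero_terminal_snr(betas)
print(new_betas[-1])                # 1.0, so no signal is left at the last step
print(trailing_timesteps(T, 4))     # [999, 749, 499, 249]
print(leading_timesteps(T, 4))      # [750, 500, 250, 0]
```

Rescaling only helps if sampling actually starts at the final timestep, which is why the two settings are used together: with `trailing` spacing the first sampled timestep is 999, while the `leading` variant here starts at 750.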

Finally, set `guidance_rescale` in your call to the pipeline to prevent overexposure:

```py
prompt = "A lion in galaxies, spirals, nebulae, stars, smoke, iridescent, intricate detail, octane render, 8k"
image = pipeline(prompt, guidance_rescale=0.7).images[0]
```

<div class="flex justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/zero_snr.png"/>
</div>
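
For intuition about `guidance_rescale`, the sketch below follows the rescaling idea described in the paper: apply classifier-free guidance, rescale the result so its standard deviation matches the text-conditioned prediction, then blend the rescaled and original results. The function name `rescale_guidance` and the flat-list inputs are assumptions for illustration; the pipeline operates on tensors and may differ in detail.

```python
import statistics


def rescale_guidance(noise_text, noise_uncond, guidance_scale, guidance_rescale):
    """Apply classifier-free guidance, then rescale it toward the text branch."""
    # Standard classifier-free guidance.
    noise_cfg = [u + guidance_scale * (t - u) for t, u in zip(noise_text, noise_uncond)]

    # Match the standard deviation of the guided prediction to the
    # text-conditioned prediction, which counteracts overexposure.
    std_text = statistics.pstdev(noise_text)
    std_cfg = statistics.pstdev(noise_cfg)
    rescaled = [x * std_text / std_cfg for x in noise_cfg]

    # Blend: 0.0 keeps plain guidance, 1.0 uses the fully rescaled result.
    return [
        guidance_rescale * r + (1.0 - guidance_rescale) * x
        for r, x in zip(rescaled, noise_cfg)
    ]


noise_text = [1.0, -1.0, 0.5, -0.5]
noise_uncond = [0.5, -0.5, 0.25, -0.25]

plain = rescale_guidance(noise_text, noise_uncond, guidance_scale=7.5, guidance_rescale=0.0)
blended = rescale_guidance(noise_text, noise_uncond, guidance_scale=7.5, guidance_rescale=0.7)
full = rescale_guidance(noise_text, noise_uncond, guidance_scale=7.5, guidance_rescale=1.0)
```

With `guidance_rescale=0.0` the output is ordinary classifier-free guidance, and with `1.0` its standard deviation matches the text-conditioned prediction. A value such as `0.7` sits between the two.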