
Weighting prompts

[[open-in-colab]]

Text-guided diffusion models generate images based on a given text prompt. A text prompt can contain several concepts that the model should generate, and it is often desirable to give certain parts of the prompt more or less weight.

Diffusion models work by conditioning the cross attention layers of the diffusion model on contextualized text embeddings (see the Stable Diffusion Guide for more information). A simple way to emphasize (or de-emphasize) certain parts of the prompt is therefore to increase or decrease the scale of the text embedding vectors that correspond to the relevant part of the prompt. This is called "prompt weighting" and has been a highly requested feature in the community (see the related issue).
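To make the idea of scaling embedding vectors concrete, here is a toy sketch in plain Python. The helper `scale_token_embeddings`, the two-dimensional vectors, and the weight 1.5 are all made up for illustration; real embeddings come from the text encoder, and libraries such as Compel may combine weights in a more involved way than a plain multiplication.

```python
def scale_token_embeddings(embeddings, weights):
    """Multiply each token's embedding vector by its per-token weight.

    Hypothetical helper for illustration only; not part of diffusers or compel.
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per token embedding")
    return [[w * x for x in vec] for vec, w in zip(embeddings, weights)]


tokens = ["a", "red", "cat", "playing", "with", "a", "ball"]
# Toy 2-dimensional embeddings; a real text encoder produces far larger vectors.
embeddings = [[1.0, -2.0] for _ in tokens]
# Emphasize only the last token ("ball"); every other token keeps weight 1.0.
weights = [1.0] * (len(tokens) - 1) + [1.5]

scaled = scale_token_embeddings(embeddings, weights)
```

Only the vector for "ball" changes magnitude; the vectors of all other tokens are left untouched.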

How to do prompt weighting in Diffusers

We believe the role of diffusers is to be a toolbox that provides the essential features that enable other projects, such as InvokeAI or diffuzers, to build powerful UIs. To support arbitrary ways of manipulating prompts, diffusers exposes a prompt_embeds argument on many pipelines, such as StableDiffusionPipeline, so that "prompt-weighted" (scaled) text embeddings can be passed directly to the pipeline.
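As a rough sketch of what producing embeddings by hand could look like, the helper below tokenizes a prompt and runs it through the pipeline's text encoder. The name `embed_prompt` is hypothetical, and the sketch assumes a pipeline object that exposes `tokenizer`, `text_encoder`, and `device` the way StableDiffusionPipeline does. It applies no weighting; it only shows where scaled embeddings would come from.

```python
def embed_prompt(pipe, prompt):
    """Return the text encoder's hidden states for `prompt`.

    Hypothetical helper: assumes `pipe` has `tokenizer`, `text_encoder`
    and `device` attributes, as StableDiffusionPipeline does.
    """
    tokens = pipe.tokenizer(
        prompt,
        padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    )
    # The first element of the encoder output holds the per-token embeddings.
    return pipe.text_encoder(tokens.input_ids.to(pipe.device))[0]


# Intended usage with a loaded pipeline (not executed here):
# prompt_embeds = embed_prompt(pipe, "a red cat playing with a ball")
# image = pipe(prompt_embeds=prompt_embeds, num_inference_steps=20).images[0]
```

Any scaling of individual token vectors would happen between these two steps, which is the part that a library like Compel takes care of.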

The Compel library provides an easy way to emphasize or de-emphasize parts of a prompt. We strongly recommend using it instead of preparing the embeddings yourself.

Let's look at a simple example. Suppose we want to generate an image of "a red cat playing with a ball":

import torch
from diffusers import StableDiffusionPipeline, UniPCMultistepScheduler

# Load the pipeline and swap in the UniPC multistep scheduler
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

prompt = "a red cat playing with a ball"

# Fix the seed so the result is reproducible
generator = torch.Generator(device="cpu").manual_seed(33)

image = pipe(prompt, generator=generator, num_inference_steps=20).images[0]
image

์ƒ์„ฑ๋œ ์ด๋ฏธ์ง€:

[image: output for the unweighted prompt]

์‚ฌ์ง„์—์„œ ์•Œ ์ˆ˜ ์žˆ๋“ฏ์ด, "๊ณต"์€ ์ด๋ฏธ์ง€์— ์—†์Šต๋‹ˆ๋‹ค. ์ด ๋ถ€๋ถ„์„ ๊ฐ•์กฐํ•ด ๋ณผ๊นŒ์š”!

First, install the compel library:

pip install compel

Next, create a Compel object:

from compel import Compel

compel_proc = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder)

Now let's emphasize "ball" using the "++" syntax:

prompt = "a red cat playing with a ball++"

Instead of passing this prompt straight to the pipeline, process it with compel_proc:

prompt_embeds = compel_proc(prompt)

ํŒŒ์ดํ”„๋ผ์ธ์— prompt_embeds ๋ฅผ ๋ฐ”๋กœ ์ „๋‹ฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:

generator = torch.Generator(device="cpu").manual_seed(33)

image = pipe(prompt_embeds=prompt_embeds, generator=generator, num_inference_steps=20).images[0]
image

Now we get an image that includes the "ball"!

[image: output for the prompt with "ball++"]

Similarly, you can de-emphasize part of the sentence by adding the -- suffix to a word. Give it a try!
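To get a feel for how strongly the suffixes act, the sketch below turns a `+`/`-` suffix into a numeric weight. It assumes, following our reading of the Compel syntax documentation, that each `+` multiplies the weight by 1.1 and each `-` by 0.9; treat those factors as an assumption and check the Compel docs for the version you use. The function `suffix_weight` is a hypothetical helper, not part of Compel.

```python
def suffix_weight(term: str, up: float = 1.1, down: float = 0.9) -> float:
    """Estimate the weight implied by trailing '+' or '-' characters.

    Hypothetical helper; the factors 1.1 and 0.9 are assumed defaults.
    """
    stripped = term.rstrip("+-")
    suffix = term[len(stripped):]
    if not suffix:
        return 1.0  # no suffix: the term keeps its normal weight
    if set(suffix) == {"+"}:
        return up ** len(suffix)
    if set(suffix) == {"-"}:
        return down ** len(suffix)
    raise ValueError(f"mixed '+' and '-' suffix in {term!r}")


ball_up = suffix_weight("ball++")    # two '+' characters
ball_down = suffix_weight("ball--")  # two '-' characters
```

Under these assumed factors, `ball++` lands a little above 1.2 and `ball--` a little above 0.8, so each extra character is a fairly gentle nudge.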

์ฆ๊ฒจ์ฐพ๋Š” ํŒŒ์ดํ”„๋ผ์ธ์— prompt_embeds ์ž…๋ ฅ์ด ์—†๋Š” ๊ฒฝ์šฐ issue๋ฅผ ์ƒˆ๋กœ ๋งŒ๋“ค์–ด์ฃผ์„ธ์š”. Diffusers ํŒ€์€ ์ตœ๋Œ€ํ•œ ๋Œ€์‘ํ•˜๋ ค๊ณ  ๋…ธ๋ ฅํ•ฉ๋‹ˆ๋‹ค.

Compel 1.1.6 adds a utility class that simplifies the use of textual inversions. Instantiate a DiffusersTextualInversionManager and pass it to the Compel init:

from compel import Compel, DiffusersTextualInversionManager

textual_inversion_manager = DiffusersTextualInversionManager(pipe)
compel = Compel(
    tokenizer=pipe.tokenizer,
    text_encoder=pipe.text_encoder,
    textual_inversion_manager=textual_inversion_manager)

๋” ๋งŽ์€ ์ •๋ณด๋ฅผ ์–ป๊ณ  ์‹ถ๋‹ค๋ฉด compel ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.