Text-guided depth-to-image generation

[[open-in-colab]]

[StableDiffusionDepth2ImgPipeline] lets you pass a text prompt and an initial image to condition the generation of new images. You can also pass a depth_map to preserve the image structure. If no depth_map is provided, the pipeline automatically predicts the depth through an integrated depth-estimation model.

Start by creating an instance of [StableDiffusionDepth2ImgPipeline]:

import torch
import requests
from PIL import Image

from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

Now pass your prompt to the pipeline. You can also pass a negative_prompt to prevent certain words from guiding how the image is generated:

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)
prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
image
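
The strength=0.7 argument in the call above controls how far the result may drift from the initial image. The sketch below is a rough illustration, assuming the pipeline follows the usual diffusers image-to-image convention of running roughly num_inference_steps * strength denoising steps. effective_denoising_steps is a hypothetical helper written for this example, not part of the library:

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Hypothetical helper: approximate number of denoising steps actually run.

    Assumes the image-to-image convention where only the last
    `num_inference_steps * strength` steps of the schedule are executed.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    return min(int(num_inference_steps * strength), num_inference_steps)


# Print the approximate step count for a few strength values at 50 scheduled steps.
for strength in (0.25, 0.5, 1.0):
    print(strength, effective_denoising_steps(50, strength))
```

Under this assumption, lower strength values keep the output closer to the initial image, while a strength of 1.0 runs the full schedule and lets the prompt dominate.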
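
You can also supply the depth_map yourself instead of letting the pipeline estimate it. This is a minimal sketch, assuming the pipeline accepts a torch.Tensor of shape (batch, height, width) and rescales it internally. make_gradient_depth_map is a hypothetical helper, and the synthetic left-to-right gradient is only a stand-in for a real depth estimate:

```python
import torch


def make_gradient_depth_map(height: int, width: int) -> torch.Tensor:
    """Hypothetical helper: synthetic depth map that increases from left to right.

    Returns a float tensor of shape (1, height, width).
    """
    ramp = torch.linspace(0.0, 1.0, steps=width)   # shape: (width,)
    return ramp.repeat(height, 1).unsqueeze(0)     # shape: (1, height, width)


depth_map = make_gradient_depth_map(480, 640)

# Assumes `pipe`, `prompt`, `init_image` and `n_prompt` from the example above:
# image = pipe(
#     prompt=prompt,
#     image=init_image,
#     depth_map=depth_map,
#     negative_prompt=n_prompt,
#     strength=0.7,
# ).images[0]
```

In practice you would replace the synthetic gradient with a depth map from a depth sensor or a separate estimation model. Avoid a constant map, since a map with no variation carries no structural information.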

Play around with the Spaces below and see if you notice a difference between images generated with and without a depth map!