	# Community Scripts

	Community scripts are inference examples built on Diffusers pipelines and contributed by the community.
	The table below gives an overview of all community scripts. Click a link in the Code Example column to jump to a copy-and-paste example that you can try out.
	If a community script doesn't work as expected, please open an issue and ping its author.

	| Example | Description | Code Example | Colab | Author |
	|:---|:---|:---|:---|---:|
	| Using IP-Adapter with negative noise | Using negative noise with IP-Adapter to better control the generation (see the [original discussion](https://github.com/huggingface/diffusers/discussions/7167) for more details) | [IP-Adapter Negative Noise](#ip-adapter-negative-noise) | | [Álvaro Somoza](https://github.com/asomoza) |
	| Asymmetric tiling | Configure seamless image tiling independently for the X and Y axes | [Asymmetric Tiling](#asymmetric-tiling) | | [alexisrolland](https://github.com/alexisrolland) |


	## Example usages

	### IP-Adapter Negative Noise

	Diffusers pipelines are fully integrated with IP-Adapter, which allows you to prompt the diffusion model with an image. However, negative image prompts are not supported the same way negative text prompts are: there is no `negative_ip_adapter_image` argument. When you pass an `ip_adapter_image`, the pipeline creates a zero-filled tensor to use as the negative image. This script shows how to create negative noise from `ip_adapter_image` and use it to significantly improve generation quality while preserving the composition of the image.

	[cubiq](https://github.com/cubiq) initially developed this feature in his [repository](https://github.com/cubiq/ComfyUI_IPAdapter_plus). The community script was contributed by [asomoza](https://github.com/asomoza). You can find more details about this experiment in [this discussion](https://github.com/huggingface/diffusers/discussions/7167).

	IP-Adapter without negative noise
	|source|result|
	|---|---|
	|![20240229150812](https://github.com/huggingface/diffusers/assets/5442875/901d8bd8-7a59-4fe7-bda1-a0e0d6c7dffd)|![20240229163923_normal](https://github.com/huggingface/diffusers/assets/5442875/3432e25a-ece6-45f4-a3f4-fca354f40b5b)|

	IP-Adapter with negative noise
	|source|result|
	|---|---|
	|![20240229150812](https://github.com/huggingface/diffusers/assets/5442875/901d8bd8-7a59-4fe7-bda1-a0e0d6c7dffd)|![20240229163923](https://github.com/huggingface/diffusers/assets/5442875/736fd15a-36ba-40c0-a7d8-6ec1ac26f788)|
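
The script below loads its negative image from `noise.png`, but the source does not say how that file was produced. As a minimal sketch, assuming the noise is made by blending the source image with uniform random noise, it could be generated with NumPy as follows. The `make_negative_noise` helper, its `strength` parameter, and the blending rule are all hypothetical choices for illustration and are not part of Diffusers.

```python
import numpy as np


def make_negative_noise(source, strength=0.5, seed=0):
    """Blend an (H, W, 3) uint8 image array with uniform random noise.

    Hypothetical helper: strength=0.0 returns the source unchanged,
    strength=1.0 returns pure noise.
    """
    rng = np.random.default_rng(seed)
    noise = rng.integers(0, 256, size=source.shape, dtype=np.uint8)
    blended = (1.0 - strength) * source.astype(np.float32) + strength * noise.astype(np.float32)
    return np.clip(blended, 0, 255).round().astype(np.uint8)


# Stand-in for the pixels of source.png (a flat grey image, used here only
# so that the snippet runs without any image file).
source = np.full((64, 64, 3), 128, dtype=np.uint8)
noise_image = make_negative_noise(source, strength=0.5)
```

With Pillow installed, `Image.fromarray(noise_image).save("noise.png")` would write the file that the script loads. The right `strength` is a matter of experimentation.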

	```python
	import torch

	from diffusers import AutoencoderKL, DPMSolverMultistepScheduler, StableDiffusionXLPipeline
	from diffusers.models import ImageProjection
	from diffusers.utils import load_image


	def encode_image(
	    image_encoder,
	    feature_extractor,
	    image,
	    device,
	    num_images_per_prompt,
	    output_hidden_states=None,
	    negative_image=None,
	):
	    dtype = next(image_encoder.parameters()).dtype

	    # Preprocess PIL images into pixel tensors; tensors are used as-is.
	    if not isinstance(image, torch.Tensor):
	        image = feature_extractor(image, return_tensors="pt").pixel_values

	    image = image.to(device=device, dtype=dtype)
	    if output_hidden_states:
	        # Use the penultimate hidden state of the image encoder.
	        image_enc_hidden_states = image_encoder(image, output_hidden_states=True).hidden_states[-2]
	        image_enc_hidden_states = image_enc_hidden_states.repeat_interleave(num_images_per_prompt, dim=0)

	        if negative_image is None:
	            # Default behavior: encode an all-zero image as the negative.
	            uncond_image_enc_hidden_states = image_encoder(
	                torch.zeros_like(image), output_hidden_states=True
	            ).hidden_states[-2]
	        else:
	            # Encode the supplied negative (noise) image instead.
	            if not isinstance(negative_image, torch.Tensor):
	                negative_image = feature_extractor(negative_image, return_tensors="pt").pixel_values
	            negative_image = negative_image.to(device=device, dtype=dtype)
	            uncond_image_enc_hidden_states = image_encoder(negative_image, output_hidden_states=True).hidden_states[-2]

	        uncond_image_enc_hidden_states = uncond_image_enc_hidden_states.repeat_interleave(num_images_per_prompt, dim=0)
	        return image_enc_hidden_states, uncond_image_enc_hidden_states
	    else:
	        # In this branch the negative image is ignored and zeros are used.
	        image_embeds = image_encoder(image).image_embeds
	        image_embeds = image_embeds.repeat_interleave(num_images_per_prompt, dim=0)
	        uncond_image_embeds = torch.zeros_like(image_embeds)

	        return image_embeds, uncond_image_embeds


	@torch.no_grad()
	def prepare_ip_adapter_image_embeds(
	    unet,
	    image_encoder,
	    feature_extractor,
	    ip_adapter_image,
	    do_classifier_free_guidance,
	    device,
	    num_images_per_prompt,
	    ip_adapter_negative_image=None,
	):
	    if not isinstance(ip_adapter_image, list):
	        ip_adapter_image = [ip_adapter_image]

	    if len(ip_adapter_image) != len(unet.encoder_hid_proj.image_projection_layers):
	        raise ValueError(
	            f"`ip_adapter_image` must have same length as the number of IP Adapters. Got {len(ip_adapter_image)} images and {len(unet.encoder_hid_proj.image_projection_layers)} IP Adapters."
	        )

	    image_embeds = []
	    for single_ip_adapter_image, image_proj_layer in zip(
	        ip_adapter_image, unet.encoder_hid_proj.image_projection_layers
	    ):
	        # Projection layers other than ImageProjection consume hidden states.
	        output_hidden_state = not isinstance(image_proj_layer, ImageProjection)
	        single_image_embeds, single_negative_image_embeds = encode_image(
	            image_encoder,
	            feature_extractor,
	            single_ip_adapter_image,
	            device,
	            1,
	            output_hidden_state,
	            negative_image=ip_adapter_negative_image,
	        )
	        single_image_embeds = torch.stack([single_image_embeds] * num_images_per_prompt, dim=0)
	        single_negative_image_embeds = torch.stack([single_negative_image_embeds] * num_images_per_prompt, dim=0)

	        if do_classifier_free_guidance:
	            # Negative embeddings go first, followed by the positive ones.
	            single_image_embeds = torch.cat([single_negative_image_embeds, single_image_embeds])
	            single_image_embeds = single_image_embeds.to(device)

	        image_embeds.append(single_image_embeds)

	    return image_embeds


	vae = AutoencoderKL.from_pretrained(
	    "madebyollin/sdxl-vae-fp16-fix",
	    torch_dtype=torch.float16,
	).to("cuda")

	pipeline = StableDiffusionXLPipeline.from_pretrained(
	    "RunDiffusion/Juggernaut-XL-v9",
	    torch_dtype=torch.float16,
	    vae=vae,
	    variant="fp16",
	).to("cuda")

	pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
	pipeline.scheduler.config.use_karras_sigmas = True

	pipeline.load_ip_adapter(
	    "h94/IP-Adapter",
	    subfolder="sdxl_models",
	    weight_name="ip-adapter-plus_sdxl_vit-h.safetensors",
	    image_encoder_folder="models/image_encoder",
	)
	pipeline.set_ip_adapter_scale(0.7)

	ip_image = load_image("source.png")
	negative_ip_image = load_image("noise.png")

	image_embeds = prepare_ip_adapter_image_embeds(
	    unet=pipeline.unet,
	    image_encoder=pipeline.image_encoder,
	    feature_extractor=pipeline.feature_extractor,
	    ip_adapter_image=[[ip_image]],
	    do_classifier_free_guidance=True,
	    device="cuda",
	    num_images_per_prompt=1,
	    ip_adapter_negative_image=negative_ip_image,
	)


	prompt = "cinematic photo of a cyborg in the city, 4k, high quality, intricate, highly detailed"
	negative_prompt = "blurry, smooth, plastic"

	image = pipeline(
	    prompt=prompt,
	    negative_prompt=negative_prompt,
	    ip_adapter_image_embeds=image_embeds,
	    guidance_scale=6.0,
	    num_inference_steps=25,
	    generator=torch.Generator(device="cpu").manual_seed(1556265306),
	).images[0]

	image.save("result.png")
	```
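
In the script above, `prepare_ip_adapter_image_embeds` stacks the embeddings once per requested image and, when classifier-free guidance is enabled, concatenates the negative embeddings in front of the positive ones. The sketch below illustrates the resulting layout. It uses NumPy arrays as stand-ins for the torch tensors, and arbitrary small sizes instead of the real encoder dimensions, so that it runs without a GPU or model weights. Both choices are assumptions made for illustration.

```python
import numpy as np

num_images_per_prompt = 2
seq_len, dim = 4, 8  # arbitrary small sizes, not the real encoder dimensions

# Stand-ins for the two outputs of encode_image (a batch of 1 each).
positive = np.ones((1, seq_len, dim), dtype=np.float32)
negative = np.zeros((1, seq_len, dim), dtype=np.float32)

# Mirrors torch.stack([x] * num_images_per_prompt, dim=0).
positive = np.stack([positive] * num_images_per_prompt, axis=0)
negative = np.stack([negative] * num_images_per_prompt, axis=0)

# Mirrors torch.cat([negative, positive]): the negative half comes first.
combined = np.concatenate([negative, positive], axis=0)
print(combined.shape)  # (4, 1, 4, 8)

# Splitting the batch in two recovers the unconditional and conditional halves.
uncond_half, cond_half = np.split(combined, 2, axis=0)
```

The ordering matters: if the two halves were swapped, the guidance step would steer the generation toward the noise image rather than away from it.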

	### Asymmetric Tiling
	Stable Diffusion is not trained to generate seamless textures. However, you can use this simple script to add tiling to your generation. This script was contributed by [alexisrolland](https://github.com/alexisrolland). See more details in [this issue](https://github.com/huggingface/diffusers/issues/556).
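
The tiling effect comes from how each convolution pads its input. Circular padding wraps the opposite edge around, so the network treats the left and right (or top and bottom) borders as neighbours, while constant padding fills the border with zeros. The NumPy sketch below illustrates the idea on a tiny 3x3 grid. `asymmetric_pad` is a hypothetical helper written for this illustration, and `np.pad` with `mode="wrap"` stands in for the circular mode of `torch.nn.functional.pad` that the script uses.

```python
import numpy as np


def asymmetric_pad(x, pad, wrap_x, wrap_y):
    """Pad a 2D array by `pad` cells per side, wrapping only the chosen axes."""
    # Horizontal padding (columns): wrap around, or fill with zeros.
    x = np.pad(x, ((0, 0), (pad, pad)), mode="wrap" if wrap_x else "constant")
    # Vertical padding (rows): wrap around, or fill with zeros.
    x = np.pad(x, ((pad, pad), (0, 0)), mode="wrap" if wrap_y else "constant")
    return x


grid = np.arange(1, 10).reshape(3, 3)
padded = asymmetric_pad(grid, 1, wrap_x=True, wrap_y=False)
print(padded)
# [[0 0 0 0 0]
#  [3 1 2 3 1]
#  [6 4 5 6 4]
#  [9 7 8 9 7]
#  [0 0 0 0 0]]
```

Here only the X axis wraps: each row is extended with values from its opposite end, while the rows above and below are zeros. This corresponds to calling the script with `x_axis=True, y_axis=False`.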


	|Generated|Tiled|
	|---|---|
	|![20240313003235_573631814](https://github.com/huggingface/diffusers/assets/5442875/eca174fb-06a4-464e-a3a7-00dbb024543e)|![wall](https://github.com/huggingface/diffusers/assets/5442875/b4aa774b-2a6a-4316-a8eb-8f30b5f4d024)|


	```py
	import torch
	from typing import Optional
	from diffusers import StableDiffusionPipeline
	from diffusers.models.lora import LoRACompatibleConv

	def seamless_tiling(pipeline, x_axis, y_axis):
	    def asymmetric_conv2d_convforward(self, input: torch.Tensor, weight: torch.Tensor, bias: Optional[torch.Tensor] = None):
	        # Pad each axis separately so that X and Y can use different modes.
	        self.paddingX = (self._reversed_padding_repeated_twice[0], self._reversed_padding_repeated_twice[1], 0, 0)
	        self.paddingY = (0, 0, self._reversed_padding_repeated_twice[2], self._reversed_padding_repeated_twice[3])
	        working = torch.nn.functional.pad(input, self.paddingX, mode=x_mode)
	        working = torch.nn.functional.pad(working, self.paddingY, mode=y_mode)
	        # The input is already padded, so the convolution itself uses zero padding.
	        return torch.nn.functional.conv2d(working, weight, bias, self.stride, torch.nn.modules.utils._pair(0), self.dilation, self.groups)
	    x_mode = 'circular' if x_axis else 'constant'
	    y_mode = 'circular' if y_axis else 'constant'
	    targets = [pipeline.vae, pipeline.text_encoder, pipeline.unet]
	    convolution_layers = []
	    for target in targets:
	        for module in target.modules():
	            if isinstance(module, torch.nn.Conv2d):
	                convolution_layers.append(module)
	    # Replace the forward pass of every convolution with the asymmetric version.
	    for layer in convolution_layers:
	        if isinstance(layer, LoRACompatibleConv) and layer.lora_layer is None:
	            layer.lora_layer = lambda *x: 0
	        layer._conv_forward = asymmetric_conv2d_convforward.__get__(layer, torch.nn.Conv2d)
	    return pipeline

	pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True)
	pipeline.enable_model_cpu_offload()
	prompt = ["texture of a red brick wall"]
	seed = 123456
	generator = torch.Generator(device='cuda').manual_seed(seed)

	pipeline = seamless_tiling(pipeline=pipeline, x_axis=True, y_axis=True)
	image = pipeline(
	    prompt=prompt,
	    width=512,
	    height=512,
	    num_inference_steps=20,
	    guidance_scale=7,
	    num_images_per_prompt=1,
	    generator=generator
	).images[0]
	seamless_tiling(pipeline=pipeline, x_axis=False, y_axis=False)

	torch.cuda.empty_cache()
	image.save('image.png')
	```
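
To check the result, it can help to repeat the generated image in a grid, similar to the Tiled column shown earlier. The following is a minimal sketch under the assumption that the image is available as an `(H, W, C)` NumPy array. `tile_preview` is a hypothetical helper, and the tiny array below is a placeholder for the real pixels.

```python
import numpy as np


def tile_preview(image, rows=2, cols=2):
    """Repeat an (H, W, C) image array `rows` times vertically and `cols` times horizontally."""
    return np.tile(image, (rows, cols, 1))


# Placeholder for np.asarray(image) from the script above.
image_array = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
preview = tile_preview(image_array, rows=2, cols=3)
print(preview.shape)  # (4, 9, 3)
```

Visible seams between the copies in the preview indicate that the texture does not tile along that axis.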