import base64
from io import BytesIO
from typing import Any, Dict

import torch
from PIL import Image
from diffusers import (
    EulerAncestralDiscreteScheduler,
    StableDiffusionInstructPix2PixPipeline,
)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class EndpointHandler():
    def __init__(self, path=""):
        model_id = "timbrooks/instruct-pix2pix"
        # Half precision on GPU; float32 on CPU, where float16 is poorly supported.
        self.pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
            model_id,
            torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
            safety_checker=None,
        )
        self.pipe.to(device)
        self.pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(self.pipe.scheduler.config)

    def __call__(self, data: Dict[str, Any]) -> Image.Image:
        """
        data args:
            inputs (:obj:`str`): base64-encoded input image
            parameters (:obj:`dict`, optional): prompt, negative_prompt,
                num_inference_steps, image_guidance_scale, guidance_scale
        Return:
            A :obj:`PIL.Image.Image`: the edited image
        """
        image_data = data.pop('inputs', data)

        # Decode the base64 image to PIL; the pipeline expects 3-channel RGB input.
        image = Image.open(BytesIO(base64.b64decode(image_data))).convert('RGB')

        # Default to an empty dict (not a list) so the keyed .pop() calls below work.
        parameters = data.pop('parameters', None) or {}
        prompt = parameters.pop('prompt', None)
        negative_prompt = parameters.pop('negative_prompt', None)
        num_inference_steps = parameters.pop('num_inference_steps', 10)
        image_guidance_scale = parameters.pop('image_guidance_scale', 1.5)
        guidance_scale = parameters.pop('guidance_scale', 7.5)

        images = self.pipe(
            prompt,
            image=image,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            image_guidance_scale=image_guidance_scale,
            guidance_scale=guidance_scale,
        ).images

        return images[0]
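

# Client-side sketch, not part of the handler: one way a caller might build the
# request body that EndpointHandler.__call__ reads. `build_payload` and its
# arguments are hypothetical names introduced for illustration; only the
# 'inputs' and 'parameters' keys and the parameter names come from the handler.
import base64
from typing import Any, Dict


def build_payload(image_bytes: bytes, prompt: str, **parameters: Any) -> Dict[str, Any]:
    # 'inputs' carries the raw image file bytes as base64 text, which keeps the body JSON-safe.
    encoded = base64.b64encode(image_bytes).decode('utf-8')
    # Extra keyword arguments (e.g. num_inference_steps) pass through as parameters.
    return {'inputs': encoded, 'parameters': {'prompt': prompt, **parameters}}

# Possible usage, assuming a local file named input.png exists:
#   with open('input.png', 'rb') as f:
#       payload = build_payload(f.read(), 'make it snowy', num_inference_steps=20)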