Shitao committed on
Commit
690eecc
1 Parent(s): bc7d1d7

Update app.py

Browse files
Files changed (1) hide show
  1. app.py +34 -23
app.py CHANGED
@@ -10,7 +10,7 @@ pipe = OmniGenPipeline.from_pretrained(
10
  "Shitao/OmniGen-v1"
11
  )
12
 
13
- @spaces.GPU(duration=200)
14
  def generate_image(text, img1, img2, img3, height, width, guidance_scale, img_guidance_scale, inference_steps, seed, separate_cfg_infer, offload_model,
15
  use_input_image_size_as_output, max_input_image_size, randomize_seed):
16
  input_images = [img1, img2, img3]
@@ -57,6 +57,7 @@ def get_example():
57
  0,
58
  1024,
59
  False,
 
60
  ],
61
  [
62
  "The woman in <img><|image_1|></img> waves her hand happily in the crowd",
@@ -70,6 +71,7 @@ def get_example():
70
  128,
71
  1024,
72
  False,
 
73
  ],
74
  [
75
  "A man in a black shirt is reading a book. The man is the right man in <img><|image_1|></img>.",
@@ -83,9 +85,10 @@ def get_example():
83
  0,
84
  1024,
85
  False,
 
86
  ],
87
  [
88
- "Two woman are raising fried chicken legs in a bar. A woman is <img><|image_1|></img>. The other woman is <img><|image_2|></img>.",
89
  "./imgs/test_cases/mckenna.jpg",
90
  "./imgs/test_cases/Amanda.jpg",
91
  None,
@@ -93,9 +96,10 @@ def get_example():
93
  1024,
94
  2.5,
95
  1.8,
96
- 168,
97
  1024,
98
  False,
 
99
  ],
100
  [
101
  "A man and a short-haired woman with a wrinkled face are standing in front of a bookshelf in a library. The man is the man in the middle of <img><|image_1|></img>, and the woman is the oldest woman in <img><|image_2|></img>",
@@ -109,6 +113,7 @@ def get_example():
109
  60,
110
  1024,
111
  False,
 
112
  ],
113
  [
114
  "A man and a woman are sitting at a classroom desk. The man is the man with yellow hair in <img><|image_1|></img>. The woman is the woman on the left of <img><|image_2|></img>",
@@ -122,9 +127,10 @@ def get_example():
122
  66,
123
  1024,
124
  False,
 
125
  ],
126
  [
127
- "The flower <img><|image_1|><\/img> is placed in the vase which is in the middle of <img><|image_2|><\/img> on a wooden table of a living room",
128
  "./imgs/test_cases/rose.jpg",
129
  "./imgs/test_cases/vase.jpg",
130
  None,
@@ -135,6 +141,7 @@ def get_example():
135
  0,
136
  1024,
137
  False,
 
138
  ],
139
  [
140
  "<img><|image_1|><img>\n Remove the woman's earrings. Replace the mug with a clear glass filled with sparkling iced cola.",
@@ -148,71 +155,77 @@ def get_example():
148
  222,
149
  1024,
150
  False,
 
151
  ],
152
  [
153
  "Detect the skeleton of human in this image: <img><|image_1|></img>.",
154
  "./imgs/test_cases/control.jpg",
155
  None,
156
  None,
157
- None,
158
- None,
159
  2.0,
160
  1.6,
161
  0,
162
  1024,
163
  False,
 
164
  ],
165
  [
166
  "Generate a new photo using the following picture and text as conditions: <img><|image_1|><img>\n A young boy is sitting on a sofa in the library, holding a book. His hair is neatly combed, and a faint smile plays on his lips, with a few freckles scattered across his cheeks. The library is quiet, with rows of shelves filled with books stretching out behind him.",
167
  "./imgs/demo_cases/skeletal.png",
168
  None,
169
  None,
170
- None,
171
- None,
172
  2,
173
  1.6,
174
- 42,
175
  1024,
176
  False,
 
177
  ],
178
  [
179
  "Following the pose of this image <img><|image_1|><img>, generate a new photo: A young boy is sitting on a sofa in the library, holding a book. His hair is neatly combed, and a faint smile plays on his lips, with a few freckles scattered across his cheeks. The library is quiet, with rows of shelves filled with books stretching out behind him.",
180
  "./imgs/demo_cases/edit.png",
181
  None,
182
  None,
183
- None,
184
- None,
185
  2.0,
186
  1.6,
187
  123,
188
  1024,
189
  False,
 
190
  ],
191
  [
192
  "Following the depth mapping of this image <img><|image_1|><img>, generate a new photo: A young girl is sitting on a sofa in the library, holding a book. Her hair is neatly combed, and a faint smile plays on her lips, with a few freckles scattered across her cheeks. The library is quiet, with rows of shelves filled with books stretching out behind her.",
193
  "./imgs/demo_cases/edit.png",
194
  None,
195
  None,
196
- None,
197
- None,
198
  2.0,
199
  1.6,
200
  1,
201
  1024,
202
  False,
 
203
  ],
204
  [
205
- "<img><|image_1|><\/img> What item can be used to see the current time? Please remove it.",
206
  "./imgs/test_cases/watch.jpg",
207
  None,
208
  None,
209
- None,
210
- None,
211
  2.5,
212
  1.6,
213
- 0,
214
  1024,
215
  False,
 
216
  ],
217
  [
218
  "According to the following examples, generate an output for the input.\nInput: <img><|image_1|></img>\nOutput: <img><|image_2|></img>\n\nInput: <img><|image_3|></img>\nOutput: ",
@@ -226,16 +239,16 @@ def get_example():
226
  1,
227
  768,
228
  False,
 
229
  ],
230
  ]
231
  return case
232
 
233
- def run_for_examples(text, img1, img2, img3, height, width, guidance_scale, img_guidance_scale, seed, max_input_image_size, randomize_seed):
234
  # 在函数内部设置默认值
235
  inference_steps = 50
236
  separate_cfg_infer = True
237
  offload_model = False
238
- use_input_image_size_as_output = False
239
 
240
  return generate_image(
241
  text, img1, img2, img3, height, width, guidance_scale, img_guidance_scale,
@@ -248,7 +261,6 @@ OmniGen is a unified image generation model that you can use to perform various
248
  For multi-modal to image generation, you should pass a string as `prompt`, and a list of image paths as `input_images`. The placeholder in the prompt should be in the format of `<img><|image_*|></img>` (for the first image, the placeholder is <img><|image_1|></img>. for the second image, the placeholder is <img><|image_2|></img>).
249
  For example, use an image of a woman to generate a new image:
250
  prompt = "A woman holds a bouquet of flowers and faces the camera. The woman is \<img\>\<|image_1|\>\</img\>."
251
-
252
  Tips:
253
  - For image editing task and controlnet task, we recommend setting the height and width of output image as the same as input image. For example, if you want to edit a 512x512 image, you should set the height and width of output image as 512x512. You also can set the `use_input_image_size_as_output` to automatically set the height and width of output image as the same as input image.
254
  - For out-of-memory or time cost, you can set `offload_model=True` or refer to [./docs/inference.md#requiremented-resources](https://github.com/VectorSpaceLab/OmniGen/blob/main/docs/inference.md#requiremented-resources) to select an appropriate setting.
@@ -258,10 +270,7 @@ Tips:
258
  - Anime Style: If the generated images are in anime style, you can try to add `photo` to the prompt.
259
  - Edit generated image. If you generate an image by omnigen and then want to edit it, you cannot use the same seed to edit this image. For example, use seed=0 to generate image, and should use seed=1 to edit this image.
260
  - For image editing tasks, we recommend placing the image before the editing instruction. For example, use `<img><|image_1|></img> remove suit`, rather than `remove suit <img><|image_1|></img>`.
261
-
262
-
263
  **HF Spaces often encounter errors due to quota limitations, so recommend to run it locally.**
264
-
265
  """
266
 
267
  article = """
@@ -385,6 +394,7 @@ with gr.Blocks() as demo:
385
  seed_input,
386
  max_input_image_size,
387
  randomize_seed,
 
388
  ],
389
  outputs=output_image,
390
  )
@@ -393,3 +403,4 @@ with gr.Blocks() as demo:
393
 
394
  # launch
395
  demo.launch()
 
 
10
  "Shitao/OmniGen-v1"
11
  )
12
 
13
+ @spaces.GPU(duration=180)
14
  def generate_image(text, img1, img2, img3, height, width, guidance_scale, img_guidance_scale, inference_steps, seed, separate_cfg_infer, offload_model,
15
  use_input_image_size_as_output, max_input_image_size, randomize_seed):
16
  input_images = [img1, img2, img3]
 
57
  0,
58
  1024,
59
  False,
60
+ False,
61
  ],
62
  [
63
  "The woman in <img><|image_1|></img> waves her hand happily in the crowd",
 
71
  128,
72
  1024,
73
  False,
74
+ False,
75
  ],
76
  [
77
  "A man in a black shirt is reading a book. The man is the right man in <img><|image_1|></img>.",
 
85
  0,
86
  1024,
87
  False,
88
+ False,
89
  ],
90
  [
91
+ "Two women are raising fried chicken legs in a bar. A woman is <img><|image_1|></img>. Another woman is <img><|image_2|></img>.",
92
  "./imgs/test_cases/mckenna.jpg",
93
  "./imgs/test_cases/Amanda.jpg",
94
  None,
 
96
  1024,
97
  2.5,
98
  1.8,
99
+ 65,
100
  1024,
101
  False,
102
+ False,
103
  ],
104
  [
105
  "A man and a short-haired woman with a wrinkled face are standing in front of a bookshelf in a library. The man is the man in the middle of <img><|image_1|></img>, and the woman is the oldest woman in <img><|image_2|></img>",
 
113
  60,
114
  1024,
115
  False,
116
+ False,
117
  ],
118
  [
119
  "A man and a woman are sitting at a classroom desk. The man is the man with yellow hair in <img><|image_1|></img>. The woman is the woman on the left of <img><|image_2|></img>",
 
127
  66,
128
  1024,
129
  False,
130
+ False,
131
  ],
132
  [
133
+ "The flower <img><|image_1|></img> is placed in the vase which is in the middle of <img><|image_2|></img> on a wooden table of a living room",
134
  "./imgs/test_cases/rose.jpg",
135
  "./imgs/test_cases/vase.jpg",
136
  None,
 
141
  0,
142
  1024,
143
  False,
144
+ False,
145
  ],
146
  [
147
  "<img><|image_1|><img>\n Remove the woman's earrings. Replace the mug with a clear glass filled with sparkling iced cola.",
 
155
  222,
156
  1024,
157
  False,
158
+ True,
159
  ],
160
  [
161
  "Detect the skeleton of human in this image: <img><|image_1|></img>.",
162
  "./imgs/test_cases/control.jpg",
163
  None,
164
  None,
165
+ 1024,
166
+ 1024,
167
  2.0,
168
  1.6,
169
  0,
170
  1024,
171
  False,
172
+ True,
173
  ],
174
  [
175
  "Generate a new photo using the following picture and text as conditions: <img><|image_1|><img>\n A young boy is sitting on a sofa in the library, holding a book. His hair is neatly combed, and a faint smile plays on his lips, with a few freckles scattered across his cheeks. The library is quiet, with rows of shelves filled with books stretching out behind him.",
176
  "./imgs/demo_cases/skeletal.png",
177
  None,
178
  None,
179
+ 1024,
180
+ 1024,
181
  2,
182
  1.6,
183
+ 999,
184
  1024,
185
  False,
186
+ True,
187
  ],
188
  [
189
  "Following the pose of this image <img><|image_1|><img>, generate a new photo: A young boy is sitting on a sofa in the library, holding a book. His hair is neatly combed, and a faint smile plays on his lips, with a few freckles scattered across his cheeks. The library is quiet, with rows of shelves filled with books stretching out behind him.",
190
  "./imgs/demo_cases/edit.png",
191
  None,
192
  None,
193
+ 1024,
194
+ 1024,
195
  2.0,
196
  1.6,
197
  123,
198
  1024,
199
  False,
200
+ True,
201
  ],
202
  [
203
  "Following the depth mapping of this image <img><|image_1|><img>, generate a new photo: A young girl is sitting on a sofa in the library, holding a book. Her hair is neatly combed, and a faint smile plays on her lips, with a few freckles scattered across her cheeks. The library is quiet, with rows of shelves filled with books stretching out behind her.",
204
  "./imgs/demo_cases/edit.png",
205
  None,
206
  None,
207
+ 1024,
208
+ 1024,
209
  2.0,
210
  1.6,
211
  1,
212
  1024,
213
  False,
214
+ True,
215
  ],
216
  [
217
+ "<img><|image_1|><\/img> What item can be used to see the current time? Please highlight it in blue.",
218
  "./imgs/test_cases/watch.jpg",
219
  None,
220
  None,
221
+ 1024,
222
+ 1024,
223
  2.5,
224
  1.6,
225
+ 666,
226
  1024,
227
  False,
228
+ True,
229
  ],
230
  [
231
  "According to the following examples, generate an output for the input.\nInput: <img><|image_1|></img>\nOutput: <img><|image_2|></img>\n\nInput: <img><|image_3|></img>\nOutput: ",
 
239
  1,
240
  768,
241
  False,
242
+ False,
243
  ],
244
  ]
245
  return case
246
 
247
+ def run_for_examples(text, img1, img2, img3, height, width, guidance_scale, img_guidance_scale, seed, max_input_image_size, randomize_seed, use_input_image_size_as_output):
248
  # 在函数内部设置默认值
249
  inference_steps = 50
250
  separate_cfg_infer = True
251
  offload_model = False
 
252
 
253
  return generate_image(
254
  text, img1, img2, img3, height, width, guidance_scale, img_guidance_scale,
 
261
  For multi-modal to image generation, you should pass a string as `prompt`, and a list of image paths as `input_images`. The placeholder in the prompt should be in the format of `<img><|image_*|></img>` (for the first image, the placeholder is <img><|image_1|></img>. for the second image, the placeholder is <img><|image_2|></img>).
262
  For example, use an image of a woman to generate a new image:
263
  prompt = "A woman holds a bouquet of flowers and faces the camera. The woman is \<img\>\<|image_1|\>\</img\>."
 
264
  Tips:
265
  - For image editing task and controlnet task, we recommend setting the height and width of output image as the same as input image. For example, if you want to edit a 512x512 image, you should set the height and width of output image as 512x512. You also can set the `use_input_image_size_as_output` to automatically set the height and width of output image as the same as input image.
266
  - For out-of-memory or time cost, you can set `offload_model=True` or refer to [./docs/inference.md#requiremented-resources](https://github.com/VectorSpaceLab/OmniGen/blob/main/docs/inference.md#requiremented-resources) to select an appropriate setting.
 
270
  - Anime Style: If the generated images are in anime style, you can try to add `photo` to the prompt.
271
  - Edit generated image. If you generate an image by omnigen and then want to edit it, you cannot use the same seed to edit this image. For example, use seed=0 to generate image, and should use seed=1 to edit this image.
272
  - For image editing tasks, we recommend placing the image before the editing instruction. For example, use `<img><|image_1|></img> remove suit`, rather than `remove suit <img><|image_1|></img>`.
 
 
273
  **HF Spaces often encounter errors due to quota limitations, so recommend to run it locally.**
 
274
  """
275
 
276
  article = """
 
394
  seed_input,
395
  max_input_image_size,
396
  randomize_seed,
397
+ use_input_image_size_as_output,
398
  ],
399
  outputs=output_image,
400
  )
 
403
 
404
  # launch
405
  demo.launch()
406
+