Spaces: longlian/llm-grounded-diffusion

Running on T4


Tony Lian committed on Jun 19, 2023

Commit a55a1c5 • 1 Parent(s): f1f8842

Update the gradio layouts


Files changed (47)

README.md +1 -1
app.py +17 -25
gradio_cached_examples/15/log.csv +5 -5
gradio_cached_examples/39/Generated image/0a6be0bd-cbca-430f-b8ea-5ec8a0cf32f4/19eecb28178a417970470d31103e86f52a1079b6/image.png +0 -0
gradio_cached_examples/39/Generated image/0a6be0bd-cbca-430f-b8ea-5ec8a0cf32f4/captions.json +0 -1
gradio_cached_examples/39/Generated image/32ac0e0e-135a-404c-a1ee-53fdbc919db6/6e96ae22067936bf00a4f9f9415775f561fd0152/image.png +0 -0
gradio_cached_examples/39/Generated image/32ac0e0e-135a-404c-a1ee-53fdbc919db6/captions.json +1 -0
gradio_cached_examples/39/Generated image/5541f42f-a5c4-4c90-ae9c-389d0f0ea11a/captions.json +1 -0
gradio_cached_examples/39/Generated image/5541f42f-a5c4-4c90-ae9c-389d0f0ea11a/e8933d4d2aff4203da4600fd6eb763a04c8667ff/image.png +0 -0
gradio_cached_examples/39/Generated image/5818e92b-be12-44af-a000-022499aab645/72788356baeec6ff0f3614c57621d60a801b6a7f/image.png +0 -0
gradio_cached_examples/39/Generated image/5818e92b-be12-44af-a000-022499aab645/captions.json +0 -1
gradio_cached_examples/39/Generated image/7dbf49b5-a987-4285-9ecb-899fc0897489/1ba27d75ea6c232428e503a0336d8eb3c346c0b3/image.png +0 -0
gradio_cached_examples/39/Generated image/7dbf49b5-a987-4285-9ecb-899fc0897489/captions.json +1 -0
gradio_cached_examples/39/Generated image/92613514-ef71-44f5-807d-84a494dedeb1/672846c3033c99ea94199567efdb1955ee5ab7ce/image.png +0 -0
gradio_cached_examples/39/Generated image/92613514-ef71-44f5-807d-84a494dedeb1/captions.json +0 -1
gradio_cached_examples/39/Generated image/ae08bef2-f889-441a-ba1e-026445bb386a/1a312139177423e79631a7bf40aa1ac531efb744/image.png +0 -0
gradio_cached_examples/39/Generated image/ae08bef2-f889-441a-ba1e-026445bb386a/captions.json +1 -0
gradio_cached_examples/39/Generated image/c048f5e9-7f96-4da7-823d-3a898a4eac92/57a1c5b1ccb262cea6f1ae86fa5e70c89d379a6f/image.png +0 -0
gradio_cached_examples/39/Generated image/c048f5e9-7f96-4da7-823d-3a898a4eac92/captions.json +0 -1
gradio_cached_examples/39/Generated image/d216beac-010e-4466-856c-9d92e471654c/90a51edff815fd0aaef1864d6784583e800be8d8/image.png +0 -0
gradio_cached_examples/39/Generated image/d216beac-010e-4466-856c-9d92e471654c/captions.json +1 -0
gradio_cached_examples/39/Generated image/e05fc15c-d202-4cb4-b235-6b48d03ef03b/8821c44e2875b2e5fd9d9173c6b6bf6a5267be08/image.png +0 -0
gradio_cached_examples/39/Generated image/e05fc15c-d202-4cb4-b235-6b48d03ef03b/captions.json +0 -1
gradio_cached_examples/39/log.csv +5 -5
gradio_cached_examples/49/Generated image/569b2539-1b09-422e-8f04-28e85cb5ce6b/79b47dee4bf06f02baaddf31631dadf4f0a77b1b/image.png +0 -0
gradio_cached_examples/49/Generated image/569b2539-1b09-422e-8f04-28e85cb5ce6b/captions.json +1 -0
gradio_cached_examples/49/Generated image/7ca4de19-dacd-433a-9bda-44a30411773a/captions.json +1 -0
gradio_cached_examples/49/Generated image/7ca4de19-dacd-433a-9bda-44a30411773a/da41c41cef06d8895f87bd51bccacb9e5ee6fc13/image.png +0 -0
gradio_cached_examples/49/Generated image/9d74cf63-2741-4aa1-9b9d-284ce36b1272/916b46e1b9e7e59a0f42ea2e0e9d3ac2077ddb29/image.png +0 -0
gradio_cached_examples/49/Generated image/9d74cf63-2741-4aa1-9b9d-284ce36b1272/captions.json +1 -0
gradio_cached_examples/49/Generated image/d1cff19c-eda7-411a-97bd-598780ee1514/111213a2bec11fbeb98d5cf421ff3f1e90ac2a6f/image.png +0 -0
gradio_cached_examples/49/Generated image/d1cff19c-eda7-411a-97bd-598780ee1514/captions.json +1 -0
gradio_cached_examples/49/Generated image/ff249b87-f078-4ed7-b702-d9c026c2ae0b/30ac54337ceb5917e94befaaa6939bdb2970ea50/image.png +0 -0
gradio_cached_examples/49/Generated image/ff249b87-f078-4ed7-b702-d9c026c2ae0b/captions.json +1 -0
gradio_cached_examples/49/log.csv +6 -0
gradio_cached_examples/51/Generated image/52711207-5d80-4eb1-abd1-7ca09ae82f7d/91b5f67cc8cf5b4a8fd2aea741f4175606bbe7b5/image.png +0 -0
gradio_cached_examples/51/Generated image/52711207-5d80-4eb1-abd1-7ca09ae82f7d/captions.json +0 -1
gradio_cached_examples/51/Generated image/6a5728a0-b580-4114-8c1c-7a3313fcad79/6b704ebfdeabbdcc40397de5d1d12ab6e6c167a6/image.png +0 -0
gradio_cached_examples/51/Generated image/6a5728a0-b580-4114-8c1c-7a3313fcad79/captions.json +0 -1
gradio_cached_examples/51/Generated image/8e44d54e-7b4a-46ec-aacc-67ef88a61505/7778fc9077843c4de514cab097cc5ce9d689be7d/image.png +0 -0
gradio_cached_examples/51/Generated image/8e44d54e-7b4a-46ec-aacc-67ef88a61505/captions.json +0 -1
gradio_cached_examples/51/Generated image/98ce623d-5866-46ae-8e57-4690871fa04f/captions.json +0 -1
gradio_cached_examples/51/Generated image/98ce623d-5866-46ae-8e57-4690871fa04f/da7f62f9d0ef44ec431cec8a80a9eabfea6794ab/image.png +0 -0
gradio_cached_examples/51/Generated image/b9462897-294a-42b2-9cd6-89d348b707fc/5981833148e050fc2bbd906d2a91cc66a2782ac0/image.png +0 -0
gradio_cached_examples/51/Generated image/b9462897-294a-42b2-9cd6-89d348b707fc/captions.json +0 -1
gradio_cached_examples/51/log.csv +0 -6
requirements.txt +1 -1

README.md CHANGED

@@ -4,7 +4,7 @@ emoji: 😊
 colorFrom: red
 colorTo: pink
 sdk: gradio
-sdk_version: 3.34.0
+sdk_version: 3.35.2
 app_file: app.py
 pinned: true
 tags: [llm, diffusion, grounding, grounded, llm-grounded, text-to-image, language, large language models, layout, generation, generative, customization, personalization, prompting, chatgpt, gpt-3.5, gpt-4]

app.py CHANGED

@@ -202,9 +202,11 @@ html = f"""<h1>LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to
             <p>2. You can perform multi-round specification by giving ChatGPT follow-up requests (e.g., make the object boxes bigger).</p>
             <p>3. You can also try prompts in Simplified Chinese. If you want to try prompts in another language, translate the first line of last example to your language.</p>
             <p>4. The diffusion model only runs 20 steps by default. You can make it run 50 steps to get higher quality images (or tweak frozen steps/guidance steps for better guidance and coherence).</p>
-            <p>5. Duplicate this space and add GPU to skip the queue and run our model faster. (Currently we are using a T4, and you can add a A10G to make it 5x faster) {duplicate_html}</p>
+            <p>5. Duplicate this space and add GPU or clone the space and run locally to skip the queue and run our model faster. (Currently we are using a T4, and you can add a A10G to make it 5x faster) {duplicate_html}</p>
             <br/>
-            <p>Implementation note: In this demo, we replace the attention manipulation in our layout-guided Stable Diffusion described in our paper with GLIGEN due to much faster inference speed (<b>FlashAttention supported, no backprop needed</b> during inference). Compared to vanilla GLIGEN, we have better coherence. Other parts of text-to-image pipeline, including single object generation and SAM, remain the same. The settings and examples in the prompt are simplified in this demo.</p>"""
+            <p>Implementation note: In this demo, we replace the attention manipulation in our layout-guided Stable Diffusion described in our paper with GLIGEN due to much faster inference speed (<b>FlashAttention supported, no backprop needed</b> during inference). Compared to vanilla GLIGEN, we have better coherence. Other parts of text-to-image pipeline, including single object generation and SAM, remain the same. The settings and examples in the prompt are simplified in this demo.</p>
+            <style>.btn {{flex-grow: unset !important;}} </style>
+            """
 with gr.Blocks(
     title="LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models"
@@ -214,11 +216,11 @@ with gr.Blocks(
         with gr.Row():
             with gr.Column(scale=1):
                 prompt = gr.Textbox(lines=2, label="Prompt for Layout Generation", placeholder=prompt_placeholder)
-                generate_btn = gr.Button("Generate Prompt", variant='primary')
+                generate_btn = gr.Button("Generate Prompt", variant='primary', elem_classes="btn")
                 with gr.Accordion("Advanced options", open=False):
                     template = gr.Textbox(lines=10, label="Custom Template", placeholder="Customized Template", value=default_template)
             with gr.Column(scale=1):
-                output = gr.Textbox(label="Paste this into ChatGPT (GPT-4 preferred; on Mac, click text and press Command+A and Command+C to copy all)")
+                output = gr.Textbox(label="Paste this into ChatGPT (GPT-4 preferred; on Mac, click text and press Command+A and Command+C to copy all)", show_copy_button=True)
                 gr.HTML("<a href='https://chat.openai.com' target='_blank'>Click here to open ChatGPT</a>")
         generate_btn.click(fn=get_lmd_prompt, inputs=[prompt, template], outputs=output, api_name="get_lmd_prompt")
@@ -230,26 +232,15 @@ with gr.Blocks(
             cache_examples=True
         )
-    # with gr.Tab("(Optional) Visualize ChatGPT-generated Layout"):
-    #     with gr.Row():
-    #         with gr.Column(scale=1):
-    #             response = gr.Textbox(lines=5, label="Paste ChatGPT response here", placeholder=layout_placeholder)
-    #             visualize_btn = gr.Button("Visualize Layout")
-    #         with gr.Column(scale=1):
-    #             output = gr.Image(shape=(512, 512), elem_classes="img", elem_id="img", css="img {width: 300px}")
-    #     visualize_btn.click(fn=get_layout_image, inputs=response, outputs=output, api_name="visualize-layout")
     with gr.Tab("Stage 2 (New). Layout to Image generation"):
         with gr.Row():
             with gr.Column(scale=1):
-                response = gr.Textbox(lines=5, label="Paste ChatGPT response here (no original caption needed)", placeholder=layout_placeholder)
-                visualize_btn = gr.Button("Visualize Layout")
-                generate_btn = gr.Button("Generate Image from Layout", variant='primary')
+                response = gr.Textbox(lines=8, label="Paste ChatGPT response here (no original caption needed)", placeholder=layout_placeholder)
+                overall_prompt_override = gr.Textbox(lines=2, label="Prompt for overall generation (optional but recommended)", placeholder="You can put your input prompt for layout generation here, helpful if your scene cannot be represented by background prompt and boxes only, e.g., with object interactions. If left empty: background prompt with [objects].", value="")
+                seed = gr.Slider(0, 10000, value=0, step=1, label="Seed")
                 with gr.Accordion("Advanced options (play around for better generation)", open=False):
-                    overall_prompt_override = gr.Textbox(lines=2, label="Prompt for overall generation (you can put your input prompt for layout generation here, helpful if your scene cannot be represented by background prompt and boxes, such as with object interactions; if left empty: background prompt with [objects])", value="")
                     frozen_step_ratio = gr.Slider(0, 1, value=0.4, step=0.1, label="Foreground frozen steps ratio (higher: preserve object attributes; lower: higher coherence; set to 0: (almost) equivalent to vanilla GLIGEN except details)")
                     gligen_scheduled_sampling_beta = gr.Slider(0, 1, value=0.3, step=0.1, label="GLIGEN guidance steps ratio (the beta value)")
-                    seed = gr.Slider(0, 10000, value=0, step=1, label="Seed")
                     num_inference_steps = gr.Slider(1, 50, value=20, step=1, label="Number of inference steps")
                     dpm_scheduler = gr.Checkbox(label="Use DPM scheduler (unchecked: DDIM scheduler, may have better coherence, recommend 50 inference steps)", show_label=False, value=True)
                     fg_seed_start = gr.Slider(0, 10000, value=20, step=1, label="Seed for foreground variation")
@@ -258,10 +249,12 @@ with gr.Blocks(
                     overall_negative_prompt = gr.Textbox(lines=1, label="Negative prompt for overall generation", value=DEFAULT_OVERALL_NEGATIVE_PROMPT)
                     show_so_imgs = gr.Checkbox(label="Show annotated single object generations", show_label=False, value=False)
                     scale_boxes = gr.Checkbox(label="Scale bounding boxes to just fit the scene", show_label=False, value=False)
+                visualize_btn = gr.Button("Visualize Layout", elem_classes="btn")
+                generate_btn = gr.Button("Generate Image from Layout", variant='primary', elem_classes="btn")
             with gr.Column(scale=1):
                 gallery = gr.Gallery(
-                    label="Generated image", show_label=False, elem_id="gallery"
-                ).style(columns=[1], rows=[1], object_fit="contain", preview=True)
+                    label="Generated image", show_label=False, elem_id="gallery", columns=[1], rows=[1], object_fit="contain", preview=True
+                )
         visualize_btn.click(fn=get_layout_image_gallery, inputs=response, outputs=gallery, api_name="visualize-layout")
         generate_btn.click(fn=get_ours_image, inputs=[response, seed, num_inference_steps, dpm_scheduler, overall_prompt_override, fg_seed_start, fg_blending_ratio, frozen_step_ratio, gligen_scheduled_sampling_beta, so_negative_prompt, overall_negative_prompt, show_so_imgs, scale_boxes], outputs=gallery, api_name="layout-to-image")
@@ -277,15 +270,14 @@ with gr.Blocks(
         with gr.Row():
             with gr.Column(scale=1):
                 sd_prompt = gr.Textbox(lines=2, label="Prompt for baseline SD", placeholder=prompt_placeholder)
-                generate_btn = gr.Button("Generate")
-                with gr.Accordion("Advanced options", open=False):
-                    seed = gr.Slider(0, 10000, value=0, step=1, label="Seed")
+                seed = gr.Slider(0, 10000, value=0, step=1, label="Seed")
+                generate_btn = gr.Button("Generate", elem_classes="btn")
             # with gr.Column(scale=1):
             #     output = gr.Image(shape=(512, 512), elem_classes="img", elem_id="img")
             with gr.Column(scale=1):
                 gallery = gr.Gallery(
-                    label="Generated image", show_label=False, elem_id="gallery2"
-                ).style(columns=[1], rows=[1], object_fit="contain", preview=True)
+                    label="Generated image", show_label=False, elem_id="gallery2", columns=[1], rows=[1], object_fit="contain", preview=True
+                )
         generate_btn.click(fn=get_baseline_image, inputs=[sd_prompt, seed], outputs=gallery, api_name="baseline")
         gr.Examples(

gradio_cached_examples/15/log.csv CHANGED

@@ -30,7 +30,7 @@ Objects: [('a tv', [88, 85, 335, 203]), ('a cabinet', [57, 308, 404, 201]), ('a
 Background prompt: An oil painting of a living room scene
 Caption: A realistic photo of a gray cat and an orange dog on the grass.
-Objects: ",,,2023-06-15 12:05:24.528652
+Objects: ",,,2023-06-19 12:19:18.120678
 "You are an intelligent bounding box generator. I will provide you with a caption for a photo, image, or painting. Your task is to generate the bounding boxes for the objects mentioned in the caption, along with a background prompt describing the scene. The images are of size 512x512, and the bounding boxes should not overlap or go beyond the image boundaries. Each bounding box should be in the format of (object name, [top-left x coordinate, top-left y coordinate, box width, box height]) and include exactly one object. Make the boxes larger if possible. Do not put objects that are already provided in the bounding boxes into the background prompt. If needed, you can make reasonable guesses. Generate the object descriptions and background prompts in English even if the caption might not be in English. Do not include non-existing or excluded objects in the background prompt. Please refer to the example below for the desired format.
 Caption: A realistic image of landscape scene depicting a green car parking on the left of a blue truck, with a red air balloon and a bird in the sky
@@ -62,7 +62,7 @@ Objects: [('a tv', [88, 85, 335, 203]), ('a cabinet', [57, 308, 404, 201]), ('a
 Background prompt: An oil painting of a living room scene
 Caption: In an indoor scene, a blue cube directly above a red cube with a vase on the left of them.
-Objects: ",,,2023-06-15 12:05:24.529323
+Objects: ",,,2023-06-19 12:19:18.121279
 "You are an intelligent bounding box generator. I will provide you with a caption for a photo, image, or painting. Your task is to generate the bounding boxes for the objects mentioned in the caption, along with a background prompt describing the scene. The images are of size 512x512, and the bounding boxes should not overlap or go beyond the image boundaries. Each bounding box should be in the format of (object name, [top-left x coordinate, top-left y coordinate, box width, box height]) and include exactly one object. Make the boxes larger if possible. Do not put objects that are already provided in the bounding boxes into the background prompt. If needed, you can make reasonable guesses. Generate the object descriptions and background prompts in English even if the caption might not be in English. Do not include non-existing or excluded objects in the background prompt. Please refer to the example below for the desired format.
 Caption: A realistic image of landscape scene depicting a green car parking on the left of a blue truck, with a red air balloon and a bird in the sky
@@ -94,7 +94,7 @@ Objects: [('a tv', [88, 85, 335, 203]), ('a cabinet', [57, 308, 404, 201]), ('a
 Background prompt: An oil painting of a living room scene
 Caption: A realistic photo of a wooden table without bananas in an indoor scene
-Objects: ",,,2023-06-15 12:05:24.529876
+Objects: ",,,2023-06-19 12:19:18.121771
 "You are an intelligent bounding box generator. I will provide you with a caption for a photo, image, or painting. Your task is to generate the bounding boxes for the objects mentioned in the caption, along with a background prompt describing the scene. The images are of size 512x512, and the bounding boxes should not overlap or go beyond the image boundaries. Each bounding box should be in the format of (object name, [top-left x coordinate, top-left y coordinate, box width, box height]) and include exactly one object. Make the boxes larger if possible. Do not put objects that are already provided in the bounding boxes into the background prompt. If needed, you can make reasonable guesses. Generate the object descriptions and background prompts in English even if the caption might not be in English. Do not include non-existing or excluded objects in the background prompt. Please refer to the example below for the desired format.
 Caption: A realistic image of landscape scene depicting a green car parking on the left of a blue truck, with a red air balloon and a bird in the sky
@@ -126,7 +126,7 @@ Objects: [('a tv', [88, 85, 335, 203]), ('a cabinet', [57, 308, 404, 201]), ('a
 Background prompt: An oil painting of a living room scene
 Caption: A man in red is standing next to another woman in blue in the mountains.
-Objects: ",,,2023-06-15 12:05:24.530394
+Objects: ",,,2023-06-19 12:19:18.122219
 "You are an intelligent bounding box generator. I will provide you with a caption for a photo, image, or painting. Your task is to generate the bounding boxes for the objects mentioned in the caption, along with a background prompt describing the scene. The images are of size 512x512, and the bounding boxes should not overlap or go beyond the image boundaries. Each bounding box should be in the format of (object name, [top-left x coordinate, top-left y coordinate, box width, box height]) and include exactly one object. Make the boxes larger if possible. Do not put objects that are already provided in the bounding boxes into the background prompt. If needed, you can make reasonable guesses. Generate the object descriptions and background prompts in English even if the caption might not be in English. Do not include non-existing or excluded objects in the background prompt. Please refer to the example below for the desired format.
 Caption: A realistic image of landscape scene depicting a green car parking on the left of a blue truck, with a red air balloon and a bird in the sky
@@ -158,4 +158,4 @@ Objects: [('a tv', [88, 85, 335, 203]), ('a cabinet', [57, 308, 404, 201]), ('a
 Background prompt: An oil painting of a living room scene
 Caption: 一个室内场景的水彩画，一个桌子上面放着一盘水果
-Objects: ",,,2023-06-15 12:05:24.530906
+Objects: ",,,2023-06-19 12:19:18.122722
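
Reviewer note: the cached prompt above asks for boxes in the form (object name, [top-left x, top-left y, box width, box height]) that stay inside a 512x512 image and do not overlap. A minimal sketch of how a pasted `Objects:` list could be checked against those two rules is shown below. The helper names (`parse_objects`, `in_bounds`, `overlaps`, `check_layout`) are hypothetical and are not part of `app.py`; the sample layout reuses only the two complete entries visible in the hunk headers.

```python
import ast
from itertools import combinations

IMAGE_SIZE = 512  # the prompt states that images are 512x512


def parse_objects(text):
    """Parse "[('name', [x, y, w, h]), ...]" into a list of (name, box) tuples."""
    return [(name, tuple(box)) for name, box in ast.literal_eval(text)]


def in_bounds(box, size=IMAGE_SIZE):
    """True if the box has positive size and lies fully inside the image."""
    x, y, w, h = box
    return x >= 0 and y >= 0 and w > 0 and h > 0 and x + w <= size and y + h <= size


def overlaps(a, b):
    """True if two [x, y, w, h] boxes share any interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def check_layout(objects, size=IMAGE_SIZE):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for name, box in objects:
        if not in_bounds(box, size):
            problems.append(f"{name}: box {list(box)} leaves the {size}x{size} image")
    for (name_a, box_a), (name_b, box_b) in combinations(objects, 2):
        if overlaps(box_a, box_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems


layout = parse_objects("[('a tv', [88, 85, 335, 203]), ('a cabinet', [57, 308, 404, 201])]")
print(check_layout(layout))  # prints []
```

`ast.literal_eval` is used here rather than `eval` because the list is pasted from an LLM response and should only ever be treated as a literal.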

gradio_cached_examples/39/Generated image/0a6be0bd-cbca-430f-b8ea-5ec8a0cf32f4/19eecb28178a417970470d31103e86f52a1079b6/image.png DELETED

Binary file (376 kB)

gradio_cached_examples/39/Generated image/0a6be0bd-cbca-430f-b8ea-5ec8a0cf32f4/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/39/Generated image/0a6be0bd-cbca-430f-b8ea-5ec8a0cf32f4/19eecb28178a417970470d31103e86f52a1079b6/image.png": null}

gradio_cached_examples/39/Generated image/32ac0e0e-135a-404c-a1ee-53fdbc919db6/6e96ae22067936bf00a4f9f9415775f561fd0152/image.png ADDED

gradio_cached_examples/39/Generated image/32ac0e0e-135a-404c-a1ee-53fdbc919db6/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/39/Generated image/32ac0e0e-135a-404c-a1ee-53fdbc919db6/6e96ae22067936bf00a4f9f9415775f561fd0152/image.png": null}

gradio_cached_examples/39/Generated image/5541f42f-a5c4-4c90-ae9c-389d0f0ea11a/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/39/Generated image/5541f42f-a5c4-4c90-ae9c-389d0f0ea11a/e8933d4d2aff4203da4600fd6eb763a04c8667ff/image.png": null}

gradio_cached_examples/39/Generated image/5541f42f-a5c4-4c90-ae9c-389d0f0ea11a/e8933d4d2aff4203da4600fd6eb763a04c8667ff/image.png ADDED

gradio_cached_examples/39/Generated image/5818e92b-be12-44af-a000-022499aab645/72788356baeec6ff0f3614c57621d60a801b6a7f/image.png DELETED

Binary file (495 kB)

gradio_cached_examples/39/Generated image/5818e92b-be12-44af-a000-022499aab645/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/39/Generated image/5818e92b-be12-44af-a000-022499aab645/72788356baeec6ff0f3614c57621d60a801b6a7f/image.png": null}

gradio_cached_examples/39/Generated image/7dbf49b5-a987-4285-9ecb-899fc0897489/1ba27d75ea6c232428e503a0336d8eb3c346c0b3/image.png ADDED

gradio_cached_examples/39/Generated image/7dbf49b5-a987-4285-9ecb-899fc0897489/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/39/Generated image/7dbf49b5-a987-4285-9ecb-899fc0897489/1ba27d75ea6c232428e503a0336d8eb3c346c0b3/image.png": null}

gradio_cached_examples/39/Generated image/92613514-ef71-44f5-807d-84a494dedeb1/672846c3033c99ea94199567efdb1955ee5ab7ce/image.png DELETED

Binary file (500 kB)

gradio_cached_examples/39/Generated image/92613514-ef71-44f5-807d-84a494dedeb1/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/39/Generated image/92613514-ef71-44f5-807d-84a494dedeb1/672846c3033c99ea94199567efdb1955ee5ab7ce/image.png": null}

gradio_cached_examples/39/Generated image/ae08bef2-f889-441a-ba1e-026445bb386a/1a312139177423e79631a7bf40aa1ac531efb744/image.png ADDED

gradio_cached_examples/39/Generated image/ae08bef2-f889-441a-ba1e-026445bb386a/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/39/Generated image/ae08bef2-f889-441a-ba1e-026445bb386a/1a312139177423e79631a7bf40aa1ac531efb744/image.png": null}

gradio_cached_examples/39/Generated image/c048f5e9-7f96-4da7-823d-3a898a4eac92/57a1c5b1ccb262cea6f1ae86fa5e70c89d379a6f/image.png DELETED

Binary file (572 kB)

gradio_cached_examples/39/Generated image/c048f5e9-7f96-4da7-823d-3a898a4eac92/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/39/Generated image/c048f5e9-7f96-4da7-823d-3a898a4eac92/57a1c5b1ccb262cea6f1ae86fa5e70c89d379a6f/image.png": null}

gradio_cached_examples/39/Generated image/d216beac-010e-4466-856c-9d92e471654c/90a51edff815fd0aaef1864d6784583e800be8d8/image.png ADDED

gradio_cached_examples/39/Generated image/d216beac-010e-4466-856c-9d92e471654c/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/39/Generated image/d216beac-010e-4466-856c-9d92e471654c/90a51edff815fd0aaef1864d6784583e800be8d8/image.png": null}

gradio_cached_examples/39/Generated image/e05fc15c-d202-4cb4-b235-6b48d03ef03b/8821c44e2875b2e5fd9d9173c6b6bf6a5267be08/image.png DELETED

Binary file (580 kB)

gradio_cached_examples/39/Generated image/e05fc15c-d202-4cb4-b235-6b48d03ef03b/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/39/Generated image/e05fc15c-d202-4cb4-b235-6b48d03ef03b/8821c44e2875b2e5fd9d9173c6b6bf6a5267be08/image.png": null}

gradio_cached_examples/39/log.csv CHANGED

@@ -1,6 +1,6 @@
 Generated image,flag,username,timestamp
-./gradio_cached_examples/39/Generated image/e05fc15c-d202-4cb4-b235-6b48d03ef03b,,,2023-06-15 12:05:31.035765
-./gradio_cached_examples/39/Generated image/92613514-ef71-44f5-807d-84a494dedeb1,,,2023-06-15 12:05:36.136151
-./gradio_cached_examples/39/Generated image/0a6be0bd-cbca-430f-b8ea-5ec8a0cf32f4,,,2023-06-15 12:05:43.006787
-./gradio_cached_examples/39/Generated image/5818e92b-be12-44af-a000-022499aab645,,,2023-06-15 12:05:46.365679
-./gradio_cached_examples/39/Generated image/c048f5e9-7f96-4da7-823d-3a898a4eac92,,,2023-06-15 12:05:51.459497
+./gradio_cached_examples/39/Generated image/ae08bef2-f889-441a-ba1e-026445bb386a,,,2023-06-19 12:19:24.628285
+./gradio_cached_examples/39/Generated image/7dbf49b5-a987-4285-9ecb-899fc0897489,,,2023-06-19 12:19:29.717383
+./gradio_cached_examples/39/Generated image/d216beac-010e-4466-856c-9d92e471654c,,,2023-06-19 12:19:36.564223
+./gradio_cached_examples/39/Generated image/5541f42f-a5c4-4c90-ae9c-389d0f0ea11a,,,2023-06-19 12:19:39.911724
+./gradio_cached_examples/39/Generated image/32ac0e0e-135a-404c-a1ee-53fdbc919db6,,,2023-06-19 12:19:44.983434

gradio_cached_examples/49/Generated image/569b2539-1b09-422e-8f04-28e85cb5ce6b/79b47dee4bf06f02baaddf31631dadf4f0a77b1b/image.png ADDED

gradio_cached_examples/49/Generated image/569b2539-1b09-422e-8f04-28e85cb5ce6b/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/49/Generated image/569b2539-1b09-422e-8f04-28e85cb5ce6b/79b47dee4bf06f02baaddf31631dadf4f0a77b1b/image.png": null}

gradio_cached_examples/49/Generated image/7ca4de19-dacd-433a-9bda-44a30411773a/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/49/Generated image/7ca4de19-dacd-433a-9bda-44a30411773a/da41c41cef06d8895f87bd51bccacb9e5ee6fc13/image.png": null}

gradio_cached_examples/49/Generated image/7ca4de19-dacd-433a-9bda-44a30411773a/da41c41cef06d8895f87bd51bccacb9e5ee6fc13/image.png ADDED

gradio_cached_examples/49/Generated image/9d74cf63-2741-4aa1-9b9d-284ce36b1272/916b46e1b9e7e59a0f42ea2e0e9d3ac2077ddb29/image.png ADDED

gradio_cached_examples/49/Generated image/9d74cf63-2741-4aa1-9b9d-284ce36b1272/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/49/Generated image/9d74cf63-2741-4aa1-9b9d-284ce36b1272/916b46e1b9e7e59a0f42ea2e0e9d3ac2077ddb29/image.png": null}

gradio_cached_examples/49/Generated image/d1cff19c-eda7-411a-97bd-598780ee1514/111213a2bec11fbeb98d5cf421ff3f1e90ac2a6f/image.png ADDED

gradio_cached_examples/49/Generated image/d1cff19c-eda7-411a-97bd-598780ee1514/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/49/Generated image/d1cff19c-eda7-411a-97bd-598780ee1514/111213a2bec11fbeb98d5cf421ff3f1e90ac2a6f/image.png": null}

gradio_cached_examples/49/Generated image/ff249b87-f078-4ed7-b702-d9c026c2ae0b/30ac54337ceb5917e94befaaa6939bdb2970ea50/image.png ADDED

gradio_cached_examples/49/Generated image/ff249b87-f078-4ed7-b702-d9c026c2ae0b/captions.json ADDED

@@ -0,0 +1 @@
+{"./gradio_cached_examples/49/Generated image/ff249b87-f078-4ed7-b702-d9c026c2ae0b/30ac54337ceb5917e94befaaa6939bdb2970ea50/image.png": null}

gradio_cached_examples/49/log.csv ADDED

@@ -0,0 +1,6 @@
+Generated image,flag,username,timestamp
+./gradio_cached_examples/49/Generated image/d1cff19c-eda7-411a-97bd-598780ee1514,,,2023-06-19 12:19:46.344457
+./gradio_cached_examples/49/Generated image/7ca4de19-dacd-433a-9bda-44a30411773a,,,2023-06-19 12:19:47.718673
+./gradio_cached_examples/49/Generated image/569b2539-1b09-422e-8f04-28e85cb5ce6b,,,2023-06-19 12:19:49.113759
+./gradio_cached_examples/49/Generated image/9d74cf63-2741-4aa1-9b9d-284ce36b1272,,,2023-06-19 12:19:50.442599
+./gradio_cached_examples/49/Generated image/ff249b87-f078-4ed7-b702-d9c026c2ae0b,,,2023-06-19 12:19:51.819811

gradio_cached_examples/51/Generated image/52711207-5d80-4eb1-abd1-7ca09ae82f7d/91b5f67cc8cf5b4a8fd2aea741f4175606bbe7b5/image.png DELETED

Binary file (477 kB)

gradio_cached_examples/51/Generated image/52711207-5d80-4eb1-abd1-7ca09ae82f7d/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/51/Generated image/52711207-5d80-4eb1-abd1-7ca09ae82f7d/91b5f67cc8cf5b4a8fd2aea741f4175606bbe7b5/image.png": null}

gradio_cached_examples/51/Generated image/6a5728a0-b580-4114-8c1c-7a3313fcad79/6b704ebfdeabbdcc40397de5d1d12ab6e6c167a6/image.png DELETED

Binary file (329 kB)

gradio_cached_examples/51/Generated image/6a5728a0-b580-4114-8c1c-7a3313fcad79/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/51/Generated image/6a5728a0-b580-4114-8c1c-7a3313fcad79/6b704ebfdeabbdcc40397de5d1d12ab6e6c167a6/image.png": null}

gradio_cached_examples/51/Generated image/8e44d54e-7b4a-46ec-aacc-67ef88a61505/7778fc9077843c4de514cab097cc5ce9d689be7d/image.png DELETED

Binary file (394 kB)

gradio_cached_examples/51/Generated image/8e44d54e-7b4a-46ec-aacc-67ef88a61505/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/51/Generated image/8e44d54e-7b4a-46ec-aacc-67ef88a61505/7778fc9077843c4de514cab097cc5ce9d689be7d/image.png": null}

gradio_cached_examples/51/Generated image/98ce623d-5866-46ae-8e57-4690871fa04f/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/51/Generated image/98ce623d-5866-46ae-8e57-4690871fa04f/da7f62f9d0ef44ec431cec8a80a9eabfea6794ab/image.png": null}

gradio_cached_examples/51/Generated image/98ce623d-5866-46ae-8e57-4690871fa04f/da7f62f9d0ef44ec431cec8a80a9eabfea6794ab/image.png DELETED

Binary file (343 kB)

gradio_cached_examples/51/Generated image/b9462897-294a-42b2-9cd6-89d348b707fc/5981833148e050fc2bbd906d2a91cc66a2782ac0/image.png DELETED

Binary file (519 kB)

gradio_cached_examples/51/Generated image/b9462897-294a-42b2-9cd6-89d348b707fc/captions.json DELETED

@@ -1 +0,0 @@
-{"./gradio_cached_examples/51/Generated image/b9462897-294a-42b2-9cd6-89d348b707fc/5981833148e050fc2bbd906d2a91cc66a2782ac0/image.png": null}

gradio_cached_examples/51/log.csv DELETED

@@ -1,6 +0,0 @@
-Generated image,flag,username,timestamp
-./gradio_cached_examples/51/Generated image/52711207-5d80-4eb1-abd1-7ca09ae82f7d,,,2023-06-15 12:05:52.813792
-./gradio_cached_examples/51/Generated image/6a5728a0-b580-4114-8c1c-7a3313fcad79,,,2023-06-15 12:05:54.185722
-./gradio_cached_examples/51/Generated image/98ce623d-5866-46ae-8e57-4690871fa04f,,,2023-06-15 12:05:55.579429
-./gradio_cached_examples/51/Generated image/b9462897-294a-42b2-9cd6-89d348b707fc,,,2023-06-15 12:05:56.908080
-./gradio_cached_examples/51/Generated image/8e44d54e-7b4a-46ec-aacc-67ef88a61505,,,2023-06-15 12:05:58.282893

requirements.txt CHANGED

@@ -9,4 +9,4 @@ opencv-contrib-python==4.7.0.72
 inflect==6.0.4
 easydict
 accelerate==0.18.0
-gradio==3.34.0
+gradio==3.35.2