first
README.md
CHANGED
@@ -1,158 +1,10 @@
-
-
-
-
-
-
-
-
-
-
-
-<!-- <img src='assets/applications.png'> -->
-## Release
-- [2024/03/12] 🔥 Code uploaded.
-
-
-
-
-## 🔥 Examples
-
-<p align="center">
-<img alt="text" src="assets/demo1.gif" width="45%">
-
-<img alt="image" src="assets/demo2.gif" width="45%">
-</p>
-
-1. **Text-Guided Editing**:Allows users to select an object within an image and replace or refine it based on a text description.
-   - Key features:
-     - Generates more realistic details and smoother transitions than alternative methods
-     - Focuses edits specifically on the targeted object
-     - Preserves unrelated parts of the image
-
-2. **Image-Guided Editing**: Enables users to choose an object from a reference image and transplant it into another image while preserving its identity.
-   - Key features:
-     - Ensures seamless integration of the object into the new context
-     - Adapts the object's appearance to match the target image's style
-     - Works effectively even when the object's appearance differs significantly between reference and target images
-
-
-
-<p align="center">
-<img alt="mask" src="assets/demo3.gif" width="45%">
-
-<img alt="remove" src="assets/demo4.gif" width="45%">
-</p>
-
-
-
-3. **Mask-Based Editing**: Involves manipulating objects by directly editing their masks.
-   - Key features:
-     - Allows for operations like moving, reshaping, resizing, and refining objects
-     - Fills in new details according to the object's associated prompt
-     - Produces natural-looking results that maintain consistency with the overall image
-
-4. **Item Removal**: Enables users to remove objects from images by deleting the mask-object associations.
-   - Key features:
-     - Intelligently fills in the empty space left by removed objects
-     - Ensures a coherent final image
-     - Maintains the integrity of the surrounding image elements
-
-## 🔧 Dependencies and Installation
-- Python >= 3.8 (Recommend to use [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
-- [PyTorch >= 2.1.0](https://pytorch.org/)
-```bash
-conda create --name dedit python=3.10
-conda activate dedit
-pip install -U pip
-
-# Install requirements
-pip install -r requirements.txt
-```
-
-
-## 💻 Run
-
-### 1. Segmentation
-Put the image (of any resolution) to be edited into the folder with a specified name, and rename the image as "img.png" or "img.jpg".
-Then run the segmentation model
-```
-sh ./scripts/run_segment.sh
-```
-Alternatively, run [GroundedSAM](https://github.com/IDEA-Research/Grounded-Segment-Anything) to detect with text prompt
-```
-sh ./scripts/run_segmentSAM.sh
-```
-
-Optionally, if segmentation is not good, refine masks with GUI by locally running the mask editing web:
-```
-python ui_edit_mask.py
-```
-For image-based editing, repeat this step for both reference and target images.
-
-### 2. Model Finetuning
-Finetune UNet cross-attention layer of diffusion models by running
-```
-sh ./scripts/sdxl/run_ft_sdxl_1024.sh
-```
-or finetune full UNet with lora
-```
-sh ./scripts/sdxl/run_ft_sdxl_1024_fulllora.sh
-```
-If image-based editing is needed, finetune the model with both reference and target images using
-
-```
-sh ./scripts/sdxl/run_ft_sdxl_1024_fulllora_2imgs.sh
-```
-
-### 3. Edit \!
-#### 3.1 Reconstruction
-To see if the original image can be constructed
-```
-sh ./scripts/sdxl/run_recon.sh
-```
-#### 3.1 Text-based
-Replace the target item (tgt_index) with the item described by the text prompt (tgt_prompt)
-```
-sh ./scripts/sdxl/run_text.sh
-```
-#### 3.2 Image-based
-Replace the target item (tgt_index) in the target image (tgt_name) with the item (src_index) in the reference image
-```
-sh ./scripts/sdxl/run_image.sh
-```
-#### 3.3 Mask-based
-For target items (tgt_indices_list), resize it (resize_list), move it (delta_x, delta_y) or reshape it by manually editing the mask shape (using UI).
-
-The resulting new masks (processed by a simple algorithm) can be visualized in './example1/move_resize/seg_move_resize.png', if it is not reasonable, edit using the UI.
-
-```
-sh ./scripts/sdxl/run_move_resize.sh
-```
-#### 3.4 Remove
-Remove the target item (tgt_index), the remaining region will be reassigned to the nearby regions with a simple algorithm.
-The resulting new masks (processed by a simple algorithm) can be visualized in './example1/remove/seg_removed.png', if it is not reasonable, edit using the UI.
-
-```
-sh ./scripts/sdxl/run_move_resize.sh
-```
-
-#### 3.4 General editing parameters
-- We partition the image into three regions as shown below. Regions with the hard mask are frozen, regions with the active mask are generated with diffusion model, and regions with soft mask keep the original content in the first "strength*N" sampling steps.
-<p align="center">
-<img src="assets/mask_def.png" height=200>
-</p>
-
-- During editing, if you use an edited segmentation that is different from finetuning, add --load_edited_mask; For mask-based and remove, if you edit the masks automatically processed by the algorithm as mentioned, add --load_edited_processed_mask.
-
-### Cite
-If you find D-Edit useful for your research and applications, please cite us using this BibTeX:
-
-```bibtex
-@article{feng2024dedit,
-title={An item is Worth a Prompt: Versatile Image Editing with Disentangled Control},
-author={Aosong Feng, Weikang Qiu, Jinbin Bai, Kaicheng Zhou, Zhen Dong, Xiao Zhang, Rex Ying, and Leandros Tassiulas},
-journal={arXiv preprint arXiv:2403.04880},
-year={2024}
-}
-```
+---
+title: {{D-Edit}}
+emoji: {{emoji}}
+colorFrom: {{colorFrom}}
+colorTo: {{colorTo}}
+sdk: {{sdk}}
+sdk_version: "{{sdkVersion}}"
+app_file: app.py
+pinned: false
+---
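
Several values in the added front matter are still wrapped in `{{…}}`, which looks like leftover scaffold syntax rather than final settings. As a minimal sketch (the helper name `unfilled_placeholders` is hypothetical and not part of this repository), such fields could be listed before pushing with a few lines of standard-library Python:

```python
import re

# Matches values still wrapped in double braces, e.g. {{emoji}}.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def unfilled_placeholders(front_matter: str) -> dict:
    """Map each front-matter key to the brace-wrapped name found in its value."""
    found = {}
    for line in front_matter.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skips the '---' delimiters and blank lines
        match = PLACEHOLDER.search(value)
        if match:
            found[key.strip()] = match.group(1)
    return found


# The front matter exactly as added in this commit.
FRONT_MATTER = """\
---
title: {{D-Edit}}
emoji: {{emoji}}
colorFrom: {{colorFrom}}
colorTo: {{colorTo}}
sdk: {{sdk}}
sdk_version: "{{sdkVersion}}"
app_file: app.py
pinned: false
---
"""

if __name__ == "__main__":
    for key, name in unfilled_placeholders(FRONT_MATTER).items():
        print(f"{key}: value {{{{{name}}}}} is still brace-wrapped")
```

The check is purely textual: it does not parse YAML and does not know which values the Space actually needs, so it only points at fields worth a second look.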