---
license: creativeml-openrail-m
base_model: CompVis/stable-diffusion-v1-4
datasets:
- ohicarip/deepfashion_bl2
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
inference: true
---
    
# Text-to-image finetuning - ohicarip/sd-deepfashion-baseline-model

This pipeline was finetuned from **CompVis/stable-diffusion-v1-4** on the **ohicarip/deepfashion_bl2** dataset. Below are some example images generated with the finetuned pipeline using the following prompts:

* "This man wears a long-sleeve sweater with pure color patterns. The sweater is with cotton fabric. It has a round neckline. The pants this man wears is of long length. The pants are with denim fabric and solid color patterns. The outer clothing the gentleman wears is with cotton fabric and solid color patterns. There is an accessory on his wrist."
* "This person is wearing a short-sleeve shirt with pure color patterns. The shirt is with cotton fabric. It has a round neckline. This person wears a long trousers. The trousers are with denim fabric and lattice patterns."
* "This guy is wearing a short-sleeve shirt with solid color patterns and a long pants. The shirt is with cotton fabric and its neckline is crew. The pants are with denim fabric and solid color patterns."
* "This female is wearing a tank tank shirt with plaid patterns and a three-point shorts. The tank shirt is with cotton fabric. The neckline of the tank shirt is crew. The shorts are with cotton fabric and plaid patterns. This lady wears socks in shoes."

![val_imgs_grid](./val_imgs_grid.png)


## Pipeline usage

You can use the pipeline like so:

```python
import torch
from diffusers import DiffusionPipeline

# float16 is meant for GPU inference; use float32 when running on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipeline = DiffusionPipeline.from_pretrained("ohicarip/sd-deepfashion-baseline-model", torch_dtype=dtype)
pipeline = pipeline.to(device)

prompt = "This man wears a long-sleeve sweater with pure color patterns. The sweater is with cotton fabric. It has a round neckline. The pants this man wears is of long length. The pants are with denim fabric and solid color patterns. The outer clothing the gentleman wears is with cotton fabric and solid color patterns. There is an accessory on his wrist."
image = pipeline(prompt).images[0]
image.save("my_image.png")
```

## Training info

These are the key hyperparameters used during training:

* Epochs: 15
* Learning rate: 1e-05
* Batch size: 8
* Gradient accumulation steps: 4
* Image resolution: 512
* Mixed-precision: fp16
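
The batch size and gradient accumulation steps listed above combine into an effective batch size per optimizer step. The sketch below works through that arithmetic, assuming training ran as a single process, since the number of GPUs is not stated here. The names `effective_batch_size` and `num_processes` are illustrative and not taken from the training script.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_processes: int = 1) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_processes


# Values from the list above; num_processes=1 is an assumption.
print(effective_batch_size(per_device_batch=8, grad_accum_steps=4))
```

If the run used more than one process, the result scales linearly with `num_processes`.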


More information on all the CLI arguments and the environment is available on the [`wandb` run page](https://wandb.ai/ohicarip/text2image-fine-tune/runs/6en1otkv).