ashllay committed on
Commit
571c7e8
1 Parent(s): b56afd3

Update README.md

Updated download links and changed github link to archived source.

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -39,14 +39,14 @@ For more information about how Stable Diffusion functions, please have a look at
39
  The **Stable-Diffusion-v1-5** checkpoint was initialized with the weights of the [Stable-Diffusion-v1-2](https:/steps/huggingface.co/CompVis/stable-diffusion-v1-2)
40
  checkpoint and subsequently fine-tuned on 595k steps at resolution 512x512 on "laion-aesthetics v2 5+" and 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
41
 
42
- You can use this both with the [🧨Diffusers library](https://github.com/huggingface/diffusers) and the [RunwayML GitHub repository](https://github.com/runwayml/stable-diffusion).
43
 
44
  ### Diffusers
45
  ```py
46
  from diffusers import StableDiffusionPipeline
47
  import torch
48
 
49
- model_id = "runwayml/stable-diffusion-v1-5"
50
  pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
51
  pipe = pipe.to("cuda")
52
 
@@ -60,10 +60,10 @@ For more detailed instructions, use-cases and examples in JAX follow the instruc
60
  ### Original GitHub Repository
61
 
62
  1. Download the weights
63
- - [v1-5-pruned-emaonly.ckpt](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt) - 4.27GB, ema-only weight. uses less VRAM - suitable for inference
64
- - [v1-5-pruned.ckpt](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned.ckpt) - 7.7GB, ema+non-ema weights. uses more VRAM - suitable for fine-tuning
65
 
66
- 2. Follow instructions [here](https://github.com/runwayml/stable-diffusion).
67
 
68
  ## Model Details
69
  - **Developed by:** Robin Rombach, Patrick Esser
@@ -176,8 +176,8 @@ Currently six Stable Diffusion checkpoints are provided, which were trained as f
176
  filtered to images with an original size `>= 512x512`, estimated aesthetics score `> 5.0`, and an estimated watermark probability `< 0.5`. The watermark estimate is from the LAION-5B metadata, the aesthetics score is estimated using an [improved aesthetics estimator](https://github.com/christophschuhmann/improved-aesthetic-predictor)).
177
  - [`stable-diffusion-v1-3`](https://huggingface.co/CompVis/stable-diffusion-v1-3): Resumed from `stable-diffusion-v1-2` - 195,000 steps at resolution `512x512` on "laion-improved-aesthetics" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
178
  - [`stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) Resumed from `stable-diffusion-v1-2` - 225,000 steps at resolution `512x512` on "laion-aesthetics v2 5+" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
179
- - [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) Resumed from `stable-diffusion-v1-2` - 595,000 steps at resolution `512x512` on "laion-aesthetics v2 5+" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
180
- - [`stable-diffusion-inpainting`](https://huggingface.co/runwayml/stable-diffusion-inpainting) Resumed from `stable-diffusion-v1-5` - then 440,000 steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+” and 10% dropping of the text-conditioning. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and in 25% mask everything.
181
 
182
  - **Hardware:** 32 x 8 x A100 GPUs
183
  - **Optimizer:** AdamW
 
39
  The **Stable-Diffusion-v1-5** checkpoint was initialized with the weights of the [Stable-Diffusion-v1-2](https:/steps/huggingface.co/CompVis/stable-diffusion-v1-2)
40
  checkpoint and subsequently fine-tuned on 595k steps at resolution 512x512 on "laion-aesthetics v2 5+" and 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
41
 
42
+ You can use this both with the [🧨Diffusers library](https://github.com/huggingface/diffusers) and the [Archive of RunwayML GitHub repository](https://github.com/ashllay/stable-diffusion-archive).
43
 
44
  ### Diffusers
45
  ```py
46
  from diffusers import StableDiffusionPipeline
47
  import torch
48
 
49
+ model_id = "ashllay/stable-diffusion-v1-5"
50
  pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
51
  pipe = pipe.to("cuda")
52
 
 
60
  ### Original GitHub Repository
61
 
62
  1. Download the weights
63
+ - [v1-5-pruned-emaonly.ckpt](https://huggingface.co/ashllay/stable-diffusion-v1-5-archive/resolve/main/v1-5-pruned-emaonly.ckpt) - 4.27GB, ema-only weight. uses less VRAM - suitable for inference
64
+ - [v1-5-pruned.ckpt](https://huggingface.co/ashllay/stable-diffusion-v1-5-archive/resolve/main/v1-5-pruned.ckpt) - 7.7GB, ema+non-ema weights. uses more VRAM - suitable for fine-tuning
65
 
66
+ 2. Follow instructions [here](https://github.com/ashllay/stable-diffusion-archive).
67
 
68
  ## Model Details
69
  - **Developed by:** Robin Rombach, Patrick Esser
 
176
  filtered to images with an original size `>= 512x512`, estimated aesthetics score `> 5.0`, and an estimated watermark probability `< 0.5`. The watermark estimate is from the LAION-5B metadata, the aesthetics score is estimated using an [improved aesthetics estimator](https://github.com/christophschuhmann/improved-aesthetic-predictor)).
177
  - [`stable-diffusion-v1-3`](https://huggingface.co/CompVis/stable-diffusion-v1-3): Resumed from `stable-diffusion-v1-2` - 195,000 steps at resolution `512x512` on "laion-improved-aesthetics" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
178
  - [`stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) Resumed from `stable-diffusion-v1-2` - 225,000 steps at resolution `512x512` on "laion-aesthetics v2 5+" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
179
+ - [`stable-diffusion-v1-5`](https://huggingface.co/ashllay/stable-diffusion-v1-5-archive) Resumed from `stable-diffusion-v1-2` - 595,000 steps at resolution `512x512` on "laion-aesthetics v2 5+" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
180
+ - [`stable-diffusion-inpainting`](https://huggingface.co/ashllay/stable-diffusion-v1-5-inpainting-archive) Resumed from `stable-diffusion-v1-5` - then 440,000 steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+” and 10% dropping of the text-conditioning. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and in 25% mask everything.
181
 
182
  - **Hardware:** 32 x 8 x A100 GPUs
183
  - **Optimizer:** AdamW
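
The download links touched by this commit all share the Hub URL shape `<host>/<repo_id>/resolve/<revision>/<filename>`. As a minimal sketch of that pattern (the helper name `resolve_url` is hypothetical, and the repo ID and file names are copied from the `+` lines above without checking that the repository is reachable), the new links could be rebuilt like this:

```python
from urllib.parse import quote

HUB = "https://huggingface.co"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL in the same form as the README links."""
    # Percent-encode the revision and file name so unusual characters stay valid.
    return f"{HUB}/{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"


# Repo ID and file names as they appear in the "+" lines of the diff.
ARCHIVE_REPO = "ashllay/stable-diffusion-v1-5-archive"
CHECKPOINTS = ["v1-5-pruned-emaonly.ckpt", "v1-5-pruned.ckpt"]

if __name__ == "__main__":
    for name in CHECKPOINTS:
        print(resolve_url(ARCHIVE_REPO, name))
```

The sketch only mirrors the checkpoint links. The Diffusers snippet in the same diff sets `model_id` to `ashllay/stable-diffusion-v1-5`, without the `-archive` suffix used by those links, so it may be worth confirming which repo ID is intended before relying on either.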