---
license: creativeml-openrail-m
tags:
- text-to-image
- stable-diffusion
- anime
- aiart
---

**This model is trained on 6(+1?) characters from ONIMAI: I'm Now Your Sister! (お兄ちゃんはおしまい!)**

### Example Generations

![00009-20230210181727-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00009-20230210181727-min.png)
![00041-20230210195115-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00041-20230210195115-min.png)

### Usage

The model is shared in both diffusers and safetensors formats.
As for the trigger words, the six characters can be prompted with
`OyamaMahiro`, `OyamaMihari`, `HozukiKaede`, `HozukiMomiji`, `OkaAsahi`, and `MurosakiMiyo`.
`TenkawaNayuta` is also tagged, but she appears in fewer than 10 images, so don't expect good results from her.
There are also three different styles trained into the model: `aniscreen`, `edstyle`, and `megazine` (yes, typo).
As usual you can get multiple-character images, but starting from 4 characters it becomes difficult.
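
For the diffusers format, the model can be loaded like any other Stable Diffusion 1.x checkpoint. A minimal sketch, assuming the `diffusers` library is installed and that the diffusers weights sit at the root of this repo (`alea31415/onimai-characters`); the sampler settings below are illustrative defaults, not the settings used for the example images:

```python
# Minimal usage sketch: load the diffusers-format weights and prompt one character.
# Repo id, trigger words, and style tags come from this card; everything else
# (dtype, step count, CFG scale) is an illustrative default.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "alea31415/onimai-characters",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "OyamaMahiro, aniscreen, 1girl, solo, smile",  # character trigger word + style tag
    negative_prompt="lowres, bad anatomy, bad hands",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("mahiro.png")
```

The safetensors version can be used in the usual way with a web UI, with the same trigger words.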

The grids below show generations from the different checkpoints.
The default one is that of step 22828, but all the checkpoints starting from step 9969 can be found in the `checkpoints` directory.
They are all sufficiently good at the six characters, but later ones are better at `megazine` and `edstyle` (possibly at the risk of overfitting, I don't really know).

![xyz_grid-0000-20230210154700.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0000-20230210154700.jpg)
![xyz_grid-0001-20230210155723.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0001-20230210155723.jpg)
![xyz_grid-0006-20230210163625.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0006-20230210163625.jpg)

### More Generations

![00011-20230210182642-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00011-20230210182642-min.png)
![00003-20230210175009-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00003-20230210175009-min.png)
![00005-20230210175301-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00005-20230210175301-min.png)
![00016-20230210183918-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00016-20230210183918-min.png)
![00019-20230210184731-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00019-20230210184731-min.png)
![00038-20230210194326-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00038-20230210194326-min.png)
![00039-20230210194529-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00039-20230210194529-min.png)
![00043-20230210195945-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00043-20230210195945-min.png)
![00047-20230210202801-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00047-20230210202801-min.png)

### Dataset Description

The dataset is prepared via the workflow detailed here: https://github.com/cyber-meow/anime_screenshot_pipeline

It contains 21412 images with the following composition:

- 2133 onimai images, separated into four types
  - 1496 anime screenshots from the first six episodes (for style `aniscreen`)
  - 70 screenshots of the anime ending (for style `edstyle`, not counted in the 1496 above)
  - 528 fan arts (or probably some official arts)
  - 39 scans of manga covers (for style `megazine`; don't ask me why I chose this name, it is bad but it turns out to work)
- 19279 regularization images, intended to be as varied as possible while staying in anime style (i.e. no photorealistic images are used)

Note that the model is trained with a specific weighting scheme to balance the different concepts, so not every image is weighted equally.
After applying the per-image repeats we get around 145K images per epoch.
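
The exact repeat values are not listed here, but the idea can be illustrated with a toy computation over the image counts above (the repeat factors below are made up, not the ones actually used):

```python
# Toy sketch of concept balancing via per-image repeats.
# Image counts are taken from the composition above; the repeat factors are
# invented for illustration and are NOT the values used to train this model.
counts = {
    "aniscreen": 1496,
    "edstyle": 70,
    "fanart": 528,
    "megazine": 39,
    "regularization": 19279,
}
repeats = {
    "aniscreen": 10,
    "edstyle": 40,
    "fanart": 15,
    "megazine": 50,
    "regularization": 6,
}

# Rare concepts get larger repeats so each one contributes a meaningful share
# of an epoch; the effective epoch size is the repeat-weighted total.
epoch_size = sum(n * repeats[name] for name, n in counts.items())
print(epoch_size)  # ~1.4e5 with these made-up repeats, the same order as the ~145K quoted above
```
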
### Training

Training is done with the [EveryDream2](https://github.com/victorchall/EveryDream2trainer) trainer, using [ACertainty](https://huggingface.co/JosephusCheung/ACertainty) as the base model.
The following configuration is used:

- resolution 512
- cosine learning rate scheduler, lr 2.5e-6
- batch size 8
- conditional dropout 0.08
- beta schedule changed from `scaled_linear` to `linear` in the `config.json` of the model's scheduler (see the sketch below)
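
The last point refers to the scheduler config shipped with the base model. A minimal sketch of that edit, assuming a local diffusers-layout copy of the base model (the file name and path follow the usual diffusers convention and may differ in your copy):

```python
# Sketch: switch the base model's beta schedule from "scaled_linear" to "linear"
# before training. The path below assumes a standard diffusers folder layout of
# ACertainty downloaded locally; adjust it to wherever your copy lives.
import json

cfg_path = "ACertainty/scheduler/scheduler_config.json"  # assumed location
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["beta_schedule"] = "linear"  # was "scaled_linear"

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```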

I trained for two epochs, whereas the default release checkpoint was trained for 22828 steps as mentioned above.