---
license: creativeml-openrail-m
tags:
- text-to-image
- stable-diffusion
- anime
- aiart
---

**This model is trained on 6(+1?) characters from ONIMAI: I'm Now Your Sister! (お兄ちゃんはおしまい!)**

### Example Generations

![00009-20230210181727-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00009-20230210181727-min.png)
![00041-20230210195115-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00041-20230210195115-min.png)

### Usage

The model is shared in both diffusers and safetensors formats.
As for the trigger words, the six characters can be prompted with
`OyamaMahiro`, `OyamaMihari`, `HozukiKaede`, `HozukiMomiji`, `OkaAsahi`, and `MurosakiMiyo`.
`TenkawaNayuta` is also tagged, but she appears in fewer than 10 images, so don't expect good results from her.
There are also three different styles trained into the model: `aniscreen`, `edstyle`, and `megazine` (yes, typo).
As usual you can get multiple-character images, but starting from 4 characters it becomes difficult.
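
For the diffusers format, the model can be loaded like any other Stable Diffusion 1.x checkpoint. A minimal sketch, assuming the `diffusers` library is installed and that the diffusers weights sit at the root of this repo (`alea31415/onimai-characters`); the sampler settings below are illustrative defaults, not the settings used for the example images:

```python
# Minimal usage sketch: load the diffusers-format weights and prompt one character.
# Repo id, trigger words, and style tags come from this card; everything else
# (dtype, step count, CFG scale) is an illustrative default.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "alea31415/onimai-characters",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "OyamaMahiro, aniscreen, 1girl, solo, smile",  # character trigger word + style tag
    negative_prompt="lowres, bad anatomy, bad hands",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("mahiro.png")
```

The safetensors version can be used in the usual way with a web UI, with the same trigger words.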

The grids below show generations from the different checkpoints.
The default one is that of step 22828, but all the checkpoints starting from step 9969 can be found in the `checkpoints` directory.
They are all sufficiently good at the six characters, but later ones are better at `megazine` and `edstyle` (possibly at the risk of overfitting, I don't really know).

![xyz_grid-0000-20230210154700.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0000-20230210154700.jpg)
![xyz_grid-0001-20230210155723.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0001-20230210155723.jpg)
![xyz_grid-0006-20230210163625.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0006-20230210163625.jpg)

### More Generations

![00011-20230210182642-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00011-20230210182642-min.png)
![00003-20230210175009-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00003-20230210175009-min.png)
![00005-20230210175301-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00005-20230210175301-min.png)
![00016-20230210183918-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00016-20230210183918-min.png)
![00019-20230210184731-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00019-20230210184731-min.png)
![00038-20230210194326-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00038-20230210194326-min.png)
![00039-20230210194529-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00039-20230210194529-min.png)
![00043-20230210195945-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00043-20230210195945-min.png)
![00047-20230210202801-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00047-20230210202801-min.png)

### Dataset Description

The dataset is prepared via the workflow detailed here: https://github.com/cyber-meow/anime_screenshot_pipeline

It contains 21412 images with the following composition:

- 2133 onimai images, separated into four types
  - 1496 anime screenshots from the first six episodes (for style `aniscreen`)
  - 70 screenshots of the anime ending (for style `edstyle`, not counted in the 1496 above)
  - 528 fan arts (or probably some official arts)
  - 39 scans of manga covers (for style `megazine`; don't ask me why I chose this name, it is bad but it turns out to work)
- 19279 regularization images, intended to be as varied as possible while staying in anime style (i.e. no photorealistic images are used)

Note that the model is trained with a specific weighting scheme to balance the different concepts, so not every image is weighted equally.
After applying the per-image repeats we get around 145K images per epoch.
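
The exact repeat values are not listed here, but the idea can be illustrated with a toy computation over the image counts above (the repeat factors below are made up, not the ones actually used):

```python
# Toy sketch of concept balancing via per-image repeats.
# Image counts are taken from the composition above; the repeat factors are
# invented for illustration and are NOT the values used to train this model.
counts = {
    "aniscreen": 1496,
    "edstyle": 70,
    "fanart": 528,
    "megazine": 39,
    "regularization": 19279,
}
repeats = {
    "aniscreen": 10,
    "edstyle": 40,
    "fanart": 15,
    "megazine": 50,
    "regularization": 6,
}

# Rare concepts get larger repeats so each one contributes a meaningful share
# of an epoch; the effective epoch size is the repeat-weighted total.
epoch_size = sum(n * repeats[name] for name, n in counts.items())
print(epoch_size)  # ~1.4e5 with these made-up repeats, the same order as the ~145K quoted above
```
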
### Training

Training is done with the [EveryDream2](https://github.com/victorchall/EveryDream2trainer) trainer, using [ACertainty](https://huggingface.co/JosephusCheung/ACertainty) as the base model.
The following configuration is used:

- resolution 512
- cosine learning rate scheduler, lr 2.5e-6
- batch size 8
- conditional dropout 0.08
- beta schedule changed from `scaled_linear` to `linear` in the `config.json` of the model's scheduler (see the sketch below)
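
The last point refers to the scheduler config shipped with the base model. A minimal sketch of that edit, assuming a local diffusers-layout copy of the base model (the file name and path follow the usual diffusers convention and may differ in your copy):

```python
# Sketch: switch the base model's beta schedule from "scaled_linear" to "linear"
# before training. The path below assumes a standard diffusers folder layout of
# ACertainty downloaded locally; adjust it to wherever your copy lives.
import json

cfg_path = "ACertainty/scheduler/scheduler_config.json"  # assumed location
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["beta_schedule"] = "linear"  # was "scaled_linear"

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```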

I trained for two epochs, whereas the default release checkpoint was trained for 22828 steps as mentioned above.