Suggested Params for generation?
Okay you guys, the base default settings in StableSwarmUI with the new CosXL initially give... um... well... yeah. Prompt: beautiful ginger man, HDR, 8k, ISO100, F1/16, Shutter Speed 1/1000
The model card is very sparse about which samplers, CFG, and number of steps to use for generation. After some testing I can say that selecting a 9x16 image in StableSwarmUI, CFG 1.5, 50 steps, DPM2 A produces much, much better results. Same prompt as before.
So it can do photorealism despite what other threads show.
I don't know about Comfy; in Diffusers I'm using
pipe.scheduler = EDMEulerScheduler(sigma_min=0.002, sigma_max=120.0, sigma_data=1.0, prediction_type="v_prediction")
CFG = 8, steps 30 to 50, but I've not tried tuning the number of steps.
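For anyone who wants to reproduce this, here's a minimal sketch of that Diffusers setup; the checkpoint filename and prompt are placeholders, and the generation settings are just the CFG 8 / 30-50 steps mentioned above, not an official recipe.

```python
# Minimal sketch, not the official recipe. "cosxl.safetensors" is a
# placeholder path for wherever you downloaded the CosXL checkpoint.
import torch
from diffusers import StableDiffusionXLPipeline, EDMEulerScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "cosxl.safetensors",
    torch_dtype=torch.float16,
).to("cuda")

# v-prediction EDM Euler schedule, same parameters as quoted above
pipe.scheduler = EDMEulerScheduler(
    sigma_min=0.002,
    sigma_max=120.0,
    sigma_data=1.0,
    prediction_type="v_prediction",
)

image = pipe(
    "beautiful ginger man, HDR, 8k",  # any prompt
    guidance_scale=8.0,               # CFG = 8
    num_inference_steps=40,           # somewhere in the 30-50 range
).images[0]
image.save("cosxl_test.png")
```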
Oh, that's interesting; that gets garbage results with your prompt, but the other ones I've run have been fine.
Using a Karras scheduler with max_sigma ~120 is also a good idea. This model can go up to max_sigma 999 comfortably, feel free to experiment.
Most DPMPP samplers should work well iirc.
Please note this model was not tuned for aesthetics at all. The model was simply an experiment we decided to release for interested researchers.
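If you want to try the Karras + DPM++ suggestion in Diffusers as well, here's a rough sketch using EDMDPMSolverMultistepScheduler with a Karras sigma schedule. It assumes the same `pipe` from the earlier snippet; sigma_max=120.0 follows the ~120 hint above, and the rest of the values are guesses to experiment with, not a recommended config.

```python
# Rough sketch: swap in a DPM++ multistep scheduler with Karras sigma spacing.
# Assumes `pipe` from the previous snippet is already loaded.
from diffusers import EDMDPMSolverMultistepScheduler

pipe.scheduler = EDMDPMSolverMultistepScheduler(
    sigma_min=0.002,
    sigma_max=120.0,          # try pushing toward 999 if you want to experiment
    sigma_data=1.0,
    sigma_schedule="karras",  # Karras-style sigma schedule
    prediction_type="v_prediction",
)

image = pipe(
    "beautiful ginger man, HDR, 8k",
    guidance_scale=8.0,
    num_inference_steps=40,
).images[0]
```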
See, that's the kind of info you should put on the model card on the Hugging Face website and the GitHub repos. How am I supposed to research if you don't tell us what's new in the release and which settings to play around with? Where was the paper behind the technique attached to this model's release? No link in the model card, again, as far as I can find. I'm not trying to be confrontational; I'm trying to be helpful in ensuring that others don't run into the same confusion I did. This was one of the shorter model cards y'all have ever released compared to others. A bit more info in some prominent places would let researchers/users know what's what.
It's not that the model isn't aesthetic; it isn't even functional like a typical v-prediction model. Did you train it with max grad norm set to 0.3? Seriously, how does this even happen?