ConvNeXtV2-IllustrationScorer

Q0: What does this model do?

A: 😎 This model scores your anime-style illustrations based on 4 metrics. 😎

Q1: What do the 4 metrics mean?

A: 🎈 The 4 metrics measure the "Liking Rate", "Collection Rate", "AI-generated Probability", and "View Number / Uploaded Interval" (i.e., Popularity). 🎈
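As a rough illustration of the "View Number / Uploaded Interval" quantity, here is a minimal sketch. The function name `popularity`, its parameters, and the choice of days as the time unit are all assumptions made for this example; the README does not state which unit the author used.

```python
from datetime import datetime, timezone


def popularity(view_count, uploaded_at, observed_at):
    """Views divided by the time since upload.

    The unit (views per day) is an assumption for illustration only.
    """
    elapsed_days = (observed_at - uploaded_at).total_seconds() / 86400.0
    if elapsed_days <= 0:
        raise ValueError("observed_at must be later than uploaded_at")
    return view_count / elapsed_days


# Hypothetical illustration: 1200 views collected over 10 days.
example = popularity(
    1200,
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 11, tzinfo=timezone.utc),
)
```

Note that this only describes the underlying quantity; the model's output for this metric is rank-based, as explained in Q2.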

Q2: Why do the "Rate" values not look like rates?

A: ✨ The author did not train this model by regressing these "Rates" directly. Instead, the values are learned in a contrastive manner (i.e., by ranking the top-k images for each "Rate"), so the outputs indicate relative ranking rather than an actual rate. The author observed that regressing the values directly produced almost no meaningful gradient during backpropagation. The author's assumption is that the model was minimizing the Absolute Error Loss simply by "remembering the average value", which is not the desired behavior. ✨
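The difference between the two objectives can be sketched in a few lines. This is not the author's training code: it uses a RankNet-style pairwise logistic loss as a stand-in for the ranking objective described above, and the function names and toy numbers are hypothetical. It shows that a constant prediction can score well on absolute error while carrying no ranking information, whereas scores on an arbitrary scale can rank perfectly.

```python
import math


def mean_absolute_error(scores, targets):
    """Plain L1 regression loss."""
    return sum(abs(s - t) for s, t in zip(scores, targets)) / len(scores)


def pairwise_ranking_loss(scores, targets):
    """RankNet-style logistic loss, averaged over all ordered pairs (i, j)
    where targets[i] > targets[j]. Only the order of the scores matters."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if targets[i] > targets[j]:
                diff = scores[i] - scores[j]
                # Numerically stable softplus(-diff) = log(1 + exp(-diff)).
                losses.append(max(-diff, 0.0) + math.log1p(math.exp(-abs(diff))))
    return sum(losses) / len(losses) if losses else 0.0


# Toy "rates" clustered in a narrow range (made-up numbers).
targets = [0.02, 0.03, 0.05, 0.04]

# A model that just "remembers" a central value.
constant_scores = [0.035] * 4

# Scores on an arbitrary scale that preserve the target order.
ordered_scores = [-2.0, 0.0, 4.0, 2.0]

mae_constant = mean_absolute_error(constant_scores, targets)
mae_ordered = mean_absolute_error(ordered_scores, targets)
rank_constant = pairwise_ranking_loss(constant_scores, targets)
rank_ordered = pairwise_ranking_loss(ordered_scores, targets)
```

Here the constant predictor has the lower absolute error but a ranking loss of log 2 on every pair (it cannot order anything), while the ordered scores have a large absolute error and a low ranking loss. This is why the model's "Rate" outputs should be read as relative scores rather than calibrated rates.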

Q3: What data was the model trained on?

A: 🤐 All of the training data (~55K) was obtained from PIXIV. 🤐

Q4: Why was this model trained?

A: 👾 The author initially hoped to fine-tune the Anything-V5 model with RLHF based on D3PO (arXiv:2311.13231), with this model serving as a multi-objective reward model. And for fun :) 👾

Acknowledgement

😨 Thanks to SUSTech CCSE, this model was trained on a single A100-80G GPU. 😨

🤗 Any suggestions are welcome :) 🤗