Stheno V3.3?
Nymeria has been my go-to model recently, and I'm curious whether v3.3's increased context would improve Nymeria's context expansion 😸
It's 50% Princeton's SimPO, so SimPO would need the same 32K PoSE training before it could be merged with v3.3. I need to check how v3.3 behaves, and if by some miracle the quality hasn't plummeted, which I strongly doubt, then maybe it's something to look into.
After a short test with the same slerp config: I really love how v3.3 goes into more detail, that aspect feels like an upgrade. But I also noticed that it doesn't follow the character card as well and often forgets minor details, which is a biggie. It also leans more on NSFW again, so some rebalancing is needed. But first, let's go beyond 8K and see what happens. If it works, great, then I will do what I can to rebalance it. But I'm not going to slap a degraded (32K PoSE-trained) SimPO into it. v3.3 has enough problems on its own.
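For anyone unfamiliar with the slerp merge mentioned above, here is a rough sketch of what spherical linear interpolation does to a pair of weight vectors. This is not the actual merge config (it isn't shown in this thread), and the function name `slerp` and the flat-list representation are just assumptions for illustration; real merge tools work per tensor and may vary `t` by layer.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Illustrative slerp between two flat weight vectors (hypothetical helper)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Degenerate case: a zero vector has no direction, so blend linearly.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two vectors, clamped to avoid acos domain errors.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    # Nearly parallel or antiparallel vectors: fall back to a linear blend.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# t = 0.5 corresponds to an even 50/50 blend of the two parents.
merged = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

With `t = 0.5` both parents contribute equally, which matches the "50% SimPO" ratio mentioned earlier; changing `t` is one way a merge can be rebalanced toward either parent.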
Also leans more on nsfw again, so some rebalancing needed.
I noticed this too; it even struggles to keep SFW cards SFW when chatting. I'd love a Stheno-style model with fewer or no NSFW samples. I feel it could be really good for general RP.
Edit - I was going to mess around with training a Stheno LoRA without NSFW samples, but the Stheno v3.2 dataset is gone.
Nymeria is the balanced version, doesn't force nsfw. Nymeria-Maid leans more on nsfw and is more submissive.
I was really hoping I could work with v3.3, but it's broken, like every other context-scaled model.