General discussion.
```python
quantization_options = [
    "Q4_K_M", "Q4_K_S", "IQ4_NL", "IQ4_XS", "Q5_K_M",
    "Q5_K_S", "Q6_K", "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XS", "IQ3_XXS"
]
```
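For reference, quant sets like this are typically produced with llama.cpp's `llama-quantize` tool, one pass per target type over a full-precision GGUF. A minimal sketch that generates the corresponding commands (the file paths here are hypothetical, not the ones used for this repo):

```python
# Quantization types offered in this repo, copied from the list above.
quantization_options = [
    "Q4_K_M", "Q4_K_S", "IQ4_NL", "IQ4_XS", "Q5_K_M",
    "Q5_K_S", "Q6_K", "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XS", "IQ3_XXS",
]

def quantize_commands(base_gguf: str, out_dir: str) -> list[str]:
    """Build one llama-quantize invocation per target type.

    base_gguf: path to the full-precision (e.g. F16) GGUF -- hypothetical name.
    out_dir:   where the quantized files would be written -- hypothetical name.
    """
    return [
        f"./llama-quantize {base_gguf} {out_dir}/model-{q}.gguf {q}"
        for q in quantization_options
    ]

for cmd in quantize_commands("model-f16.gguf", "quants"):
    print(cmd)
```

Note that the IQ3 types generally benefit from an importance matrix (`--imatrix`), which this sketch omits.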
This model is MUCH too weighted towards saying things like "being comfortable and ensuring an enjoyable experience for all parties involved" constantly whenever the user mentions anything even slightly suggestive.
@jeiku @Test157t - I'm assuming there was an attempt to remove these kinds of "refusals" or this emphasis via un-alignment?
I mean, I can pass it through Toxic DPO if you think that would help, but I have not experienced this issue when using a well-made card and giving direct orders. Let me know if you'd like me to make you a custom DPO.
Can also attest I've only seen refusals on a completely blank card in ChatML, since it falls back to early assistant-style training data.
@jeiku Yeah, I think character cards play a role here. But if pushing a new DPO version wouldn't be too much of a hassle, go ahead and we can see. For science. @Morktastic, could you share character card details, or just general information, ofc?
Experimental quant with the slightly modified data, including the RP examples: