
Thanks!

#1
by Danioken - opened

Thanks. It looks promising and works properly. The model really does seem to have no inhibitions: sometimes it makes a remark, but it still does what it is told. At the same time, creativity seems to have increased.

A small model, but very promising. On my PC I can run it even at context_length: 8192, and it still fits entirely in VRAM and is very fast.

Some quick observations:

  • usually the second generated response is more interesting than the first.
  • the model remembers facts quite well, even in a long story.
  • the model tends to use short sentences (you can change this in the prompt, but it quickly reverts to short sentences; every Gemma 2 model I have tested does this... maybe I'm doing something wrong).
  • the model follows instructions quite well; even when given several at once, it tries to carry them out one by one (and if it doesn't, regenerating the response usually makes it comply).
  • NSFW text is more interesting than in the regular version.

Thanks for making this model; I'd love to test it out more.

I've seen versions of Gemma 2 models with f16 output tensors praised as "better". But my own tests are inconclusive: these models seem more creative, but also more chaotic. Do you know anything about these f16 output tensors? With Gemma 2 models I have the impression that they work less reliably. And what about Llama? Do you have any experience with this?
