This model is good, BUT...

#1
by UniversalLove333 - opened

The golden aspect of the model (phenomenal instruction following) vanished.
I assume this is because it was trained on the base model of Nemo, and not the instruct variant?

I wonder what would happen if you used Alpaca alongside Mistral's prompting on the instruct model?
Migtissera did this with Tess 2.5 on Phi-3-medium-128k (which wasn't released with a base model).
I remember it was really good at stories, WAY better than the original Phi, but it quickly fell off into being incoherent and bad (guessing that's because of Phi, and not the fine-tune).

I haven't thoroughly tested this model, but while I found it had really good creativity, its instruction following was awful (and instruction following is what made Nemo sooo good, IME-O).


You should look into TheSkullery/NeMoria-21b. It's an upscaled version of Nemo, and IME it's quite a bit smarter and less repetitive than the original.

not to necessarily counter your experience, but any difference you perceived in nemoria is likely placebo; the merging method they used essentially zeros out the effects of the extra layers, meaning that without finetuning, the added layers do about nothing
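To make the "added layers do about nothing" point concrete, here is a minimal toy sketch. It assumes the upscale works by inserting extra residual layers whose output projection is zeroed; that is an assumption about the general technique, not a confirmed description of the NeMoria recipe. The names `residual_block`, `w_in`, and `w_out` are hypothetical and only stand in for a transformer layer's weights.

```python
# Toy illustration (assumption: added layers have a zeroed output projection).
# A residual layer computes hidden + w_out @ relu(w_in @ hidden), so if w_out
# is all zeros the update is zero and the layer passes its input through.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def residual_block(hidden, w_in, w_out):
    """Hypothetical stand-in for one layer with a residual connection."""
    inner = [max(0.0, v) for v in matvec(w_in, hidden)]  # ReLU activation
    update = matvec(w_out, inner)
    return [h + u for h, u in zip(hidden, update)]

dim, inner_dim = 4, 8
hidden = [1.0, -2.0, 0.5, 3.0]

w_in = [[0.1] * dim for _ in range(inner_dim)]
w_out_trained = [[0.2] * inner_dim for _ in range(dim)]  # a layer that does something
w_out_zeroed = [[0.0] * inner_dim for _ in range(dim)]   # a freshly added, zeroed layer

changed = residual_block(hidden, w_in, w_out_trained)
unchanged = residual_block(hidden, w_in, w_out_zeroed)

print("zeroed layer leaves input untouched:", unchanged == hidden)
print("trained layer changes the input:", changed != hidden)
```

Under that assumption, the zeroed layer only starts contributing once finetuning moves its output weights away from zero, which is the argument being made above.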

After fairly extensive testing, I agree the model is definitely good, great even for its size, but its instruction following has definitely suffered. I find the Instruct model excellent, but understand why you are tuning from the base model here. I think there is a LOT of potential in NeMo.
