Thanks, but how is v0.2 base different from v0.1?

#2
by deleted - opened
deleted

I'm glad Mistral released base v0.2, but from everything I've seen it's basically the identical twin of base v0.1.

Not only are the HF test scores nearly identical, but when I ran Instruct v0.2 through the same set of diverse questions I used on community fine-tunes of Mistral base v0.1, it got the same questions right and wrong, and the wording was virtually identical.

Yes, Instruct v0.2 is far better than Instruct v0.1, but that was primarily due to the sub-standard fine-tuning and excessive alignment of Instruct v0.1, which pulled its average HF score down from base v0.1's 61 to 55.

Other than a longer context length and the removal of sliding-window attention, I don't see any discernible difference in IQ, knowledge, or writing skills between base v0.1 and v0.2. In fact, they even draw from the same small pool of words when I ask for 5 words related to astronomy, chemistry...

Unsloth AI org

Did you try testing longer context lengths? We're unsure of the details ourselves, since more testing is needed.

deleted

@shimmyshimmer No, I never tried using longer context lengths. It takes too long to test.

But from everything I've read, that's the heart of the difference. It's essentially the same base model except for the longer context and the removal of sliding-window attention, and the test-score boost of Instruct v0.2 over v0.1 has nothing to do with using a different foundational model. They could have gotten the same Instruct v0.2 score and performance boost out of the v0.1 base model with the same fine-tuning improvements. I was hoping that base v0.2 came with a boost in IQ, language skills, or knowledge.
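For what it's worth, config-level differences like these can be surfaced mechanically by diffing the two models' `config.json` contents. A minimal sketch below; the field names follow the usual Mistral config layout, but the values are placeholder assumptions for illustration, not verified numbers from the actual files:

```python
# Sketch: diff two model config dicts to surface architectural changes.
# The v01/v02 dicts are assumed excerpts with placeholder values.

def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = set(a) | set(b)
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Placeholder excerpts of the two configs (assumptions, not verified):
v01 = {"max_position_embeddings": 32768, "sliding_window": 4096}
v02 = {"max_position_embeddings": 32768, "sliding_window": None}

for key, (old, new) in sorted(diff_configs(v01, v02).items()):
    print(f"{key}: {old} -> {new}")
```

With these placeholder values, only `sliding_window` would show up in the diff, matching the claim that the base weights are otherwise near-identical.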

deleted changed discussion status to closed
