TokenBender/llama2-7b-chat-hf-codeCherryPop-qLoRA-merged · Absolutely amazing model! Why isn't it more popular?

Sep 19, 2023

Hey mate, after running quite a few tests, I can now say with certainty that this is one of the absolute top models in the 7B segment, if not the best ever! What fascinates me about this model is that it really seems to be an all-rounder (something I had previously only seen from Vicuna). I don't know how you managed this magic, but the model is also smarter than other 7B models in my tests, which mainly cover logic and reasoning. In some of them, Code-CherryPop scores twice as high as other 7B models. Even in my subjective assessment of creative writing ability, this model consistently remains my favorite, because it pays very close attention to my instructions, gets into roles very well, AND stays informative and factual.

The model may make a superficial impression at first because it likes to use starred descriptions and often repeats itself in this respect. The generous use of emojis might also give a new user the impression that the model is crude or something similar. But this appearance is very deceptive.

So to sum it up: I am very grateful for your work and admire your model. I know that reusing the same dataset for a larger model doesn't always give better results, but I'd still be interested to see how a 13B model would behave with your training approach. Is there any prospect of you creating a larger CherryPop model? Or could you please tell me how to do it myself and what I would need?

Oh yes, as in the headline: I find that this model gets too little attention at the moment, and I wonder why it is not much better known. Is there anything I can do to help you "market" the model better (besides writing a Reddit post about it on LocalLLaMA today)? If so, let me know.

TokenBender

Owner Sep 19, 2023

Thanks for your comment :)
Nothing gives me more pleasure than somebody finding my work useful. My original intention with this model was also to build an all-round smart assistant.

It was one of my early works, and afterwards I focused on expanding my knowledge of the ecosystem and on learning how to build other types of models and datasets.
Hence I never followed up by building bigger versions of the same model.
My current recipe is quite different from my original one, but I can go back and do what you are suggesting :)

rreed-pha

Sep 19, 2023

I too found this model impressive!

phi0112358

Sep 19, 2023

You've definitely achieved the goal of building an all-round smart assistant.

Oh, that would be great if you did that! I am already very excited about the results : )