Feedback
I noticed some degradation in the way it writes between 0.3 unstable and this version - it will swap perspective in the wrong way.
For example, the card I'm using now is written entirely in a "You do X/Y/Z" narration style, and 0.2 maintains this perfectly. 0.3 and 0.5 will switch to "The man does X/Y/Z" in third person, detaching me from the story as if I were another NPC.
I noticed it doing this a few times, and I don't think I've ever seen it with 0.2, so personally I feel it's a downgrade specifically because of this behavior. The writing style in general also felt a bit more matter-of-fact to me versus 0.2, which is my current top favorite model and somehow manages to get personal with me in the right ways and touch the right heartstrings :D
I need to do more testing with 0.5, though. The perspective shift issue seems confirmed from my tests at the moment, but it should be noted it happened rarely, only on a few swipes. What I noticed more commonly was that it just felt a tad less personal (using the same templates as 0.2).
Strange you're seeing it in both 0.3 and this version, considering they're both unrelated trains. I haven't seen this perspective flipping myself - but I will keep it in mind when testing.
Could be a very rare thing, but I don't recall seeing it with 0.2 at all. Last night and today I tried 0.3 and 0.5, and it happened a couple of times in both, opening the sentence with {{user}} in third person.
Maybe it's possible on 0.2 too and it's not new. But on a "vibe" level I think 0.2 feels a bit warmer and more personal/expressive. I'll keep testing it, though.
That I could see for certain: this model stems originally from 0.2, then 0.4 (which contained only medical/bio/programming data), while 0.3 is based off of 0.1.
Went and did some edge-case testing, and I've seen what you mean. It only happened twice in about 500 generations for me. But combined with that and the occasional formatting issue, this one is going to be renamed.
It's not archived after seeing a larger pool of user feedback, but as with all of my models, consider everything experimental. (A 0.6 version is up, hoping to remedy these issues.)
@Nitral-AI This model is noticeably smarter. I know that may sound like an understatement to those who haven't used your models before; however, I am very happy with this model. The creative prose it responds with is bar none my favorite for (E)RP. And I am excited that you are working towards gradually making it even more capable as a general-purpose large language model. I can't wait to see and enjoy how this model evolves. Great work!
Appreciate the feedback! I will be trying to make overall intelligence and logic the focus for a while, since expanding context doesn't seem to be working out in my favor.
I feel like your model could dominate in the (E)RP space and become competitive as a general-purpose LLM. In the near future, perhaps you would consider collaboration opportunities with NeverSleep: enhancing your models' RP abilities in exchange for helping them better fine-tune atop Llama-3-8B-Instruct. Your model is the first fine-tune of Llama-3-8B-Instruct I am aware of that successfully retained most of the intelligence usually lost in fine-tuning. I would love to see you expand on the model's intelligence and capabilities. World-building and in-depth storytelling are crucial aspects of (E)RP, and intelligence is essential for driving and making use of these creative endeavors, as well as for adapting to new situations. I hope you drop a paper soon - we need the information you have learned through your work. I truly believe the path forward is uncensored AI. Your model did well at trying to recreate the classic game "snake" - better than my GGUF versions of Llama-3-8B-Instruct - and I strongly believe that is due to the model not being censored. Please correct me if I'm wrong, but the future looks bright from where I am standing, and I am excited about the possibilities. Also: please do reach out to NeverSleep - I'm sure you guys could work out some kind of arrangement, a good-faith "tit for tat". Cheers!
@Nitral-AI
Build the next version of Hathor atop this: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
This model was developed using Self-Play Preference Optimization at iteration 3, based on the meta-llama/Meta-Llama-3-8B-Instruct architecture as the starting point. We utilized the prompt sets from the openbmb/UltraFeedback dataset, split into 3 parts for 3 iterations by snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset. All responses used are synthetic.
I ended up doing a test merge with 0.5 and this, and I must say, the results were quite interesting. I will most certainly take this into consideration for the next time I plan to train from a relative base again.
Funny enough, Ikari is in Chaotic Neutrals, and we have had talks about this for a while now.
Yeah, he basically said something similar to me: that you guys talk often, and that you have been discussing and deliberating over the idea of an LLM fine-tune collaboration. This is going to be awesome!