Did you just compare a 7B 100K model to Claude2-100K?
https://github.com/lyogavin/Anima/tree/main/anima_100k
Since I only found 7B 100K models, I am assuming the evaluation above was done between 7B models and Claude2. I would bet Claude2 is some 200B+ MoE monster, in keeping with the scaling law of emergent abilities. If a 7B 100K model is already that good, I cannot wait to see what you can achieve with bigger models.
If QLoRA fine-tuning a 7B model only requires 800MB, a fine-tuned 70B-100K model should be on its way, I guess?
Can't wait to see what level a QLoRA fine-tuned 70B-100K model can reach.