405B version
It would be nice if you made a 405B version with the same data and training. I think it would more than likely be quite good.
The model looks good. I tried to evaluate it on standard benchmarks: it is head to head with the llama-3.1-70b model on various benchmarks like BBH and MMLU Pro, and it regresses a bit on a couple of datasets. I was expecting numbers higher than those of llama-3.1-70b. Do you have eval numbers on these standard benchmarks that I can compare against, to make sure I am not making a mistake in my evals? Thanks, NVIDIA team.
@kashifmunirai, we don't expect gains over llama-instruct, which we started from, on the benchmarks you've mentioned. This model really shines on alignment benchmarks (Arena-Hard, etc.). We'll get https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leaderboard results soon and will post them on the model card.