Benchmarks comparing with GLM-130B

#3 by KnutJaegersberg - opened

I wonder if this is the best open Chinese LLM to date. Is it better than GLM-130B?

It appears to beat it on MMLU.
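For anyone who wants to sanity-check such a comparison rather than rely on reported numbers, here is a minimal sketch of the per-choice log-likelihood scoring that MMLU-style evaluation boils down to. It assumes the `xverse/XVERSE-13B` checkpoint on Hugging Face and uses a made-up example question; published MMLU scores are normally produced with a full harness (5-shot prompts over all 57 subjects), and GLM-130B would need its own loading path, so treat this only as an illustration.

```python
# Sketch: pick the answer a causal LM assigns the highest log-likelihood to.
# Model ID is the public XVERSE-13B repo; the question and helper are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "xverse/XVERSE-13B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)
model.eval()

question = (
    "Question: Which planet is known as the Red Planet?\n"
    "A. Venus\nB. Mars\nC. Jupiter\nD. Saturn\nAnswer:"
)
choices = [" A", " B", " C", " D"]

def choice_logprob(prompt: str, choice: str) -> float:
    """Sum of log-probabilities the model assigns to the choice tokens.

    Simplification: we split prompt/continuation by prompt token count,
    which ignores possible tokenizer merges at the boundary.
    """
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + choice, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits at position i predict the token at position i + 1
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    cont_ids = full_ids[0, prompt_len:]
    cont_logprobs = log_probs[prompt_len - 1 :, :].gather(1, cont_ids.unsqueeze(-1))
    return cont_logprobs.sum().item()

scores = {c.strip(): choice_logprob(question, c) for c in choices}
print(max(scores, key=scores.get), scores)
```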

XVERSE Technology org

Thank you for your interest in XVERSE-13B. Given the significant scale difference between XVERSE-13B and GLM-130B, it may not be fair to compare them directly.

Yeah, I know the difference, but GLM-130B is roughly at the level of the original GPT-3. I'm certain that these days a significantly smaller model could beat it if pretrained on better data and for longer.
The authors of LLaMA claimed their 13B model matched GPT-3 performance.

https://www.technology.org/2023/02/27/meta-our-llama-13b-outperforms-openais-gpt-3-despite-being-10x-smaller/

KnutJaegersberg changed discussion status to closed
XVERSE Technology org

Apologies for the delay. At the moment, our primary focus is on comparing models in the 13B range, so we haven't evaluated against GLM-130B. We will conduct this comparison in the future.
