MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures • Paper • 2410.13754 • Published 20 days ago • 74
Harnessing Webpage UIs for Text-Rich Visual Understanding • Paper • 2410.13824 • Published 20 days ago • 29
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models • Paper • 2410.07985 • Published 27 days ago • 26
Data Selection via Optimal Control for Language Models • Paper • 2410.07064 • Published 28 days ago • 8
Self-Boosting Large Language Models with Synthetic Preference Data • Paper • 2410.06961 • Published 28 days ago • 15