Response Tuning: Aligning Large Language Models without Instruction Paper • 2410.02465 • Published Oct 3 • 12
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs Paper • 2410.04698 • Published Oct 7 • 13
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References Paper • 2410.05193 • Published Oct 2024 • 12
Collaborative Performance Prediction for Large Language Models Paper • 2407.01300 • Published Jul 1 • 2