CheckEval: Robust Evaluation Framework using Large Language Model via Checklist • Paper • 2403.18771 • Published Mar 27
RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners • Paper • 2403.12373 • Published Mar 19