Leaderboard is of very limited use without more 0-shot, instruction prompted datasets
#27 by JulesGM - opened
Most LLM use nowadays is zero-shot with prompting, yet there is just one fairly specific dataset evaluating this.
I think it would be important to add more zero-shot, instruction-prompted datasets, as this is how the models will be used a large fraction of the time.
Hi! We tried to select a good range of evaluation tasks based on what is used in the literature to compare models :)
We might add more 0-shot evaluations in the future!
clefourrier changed discussion status to closed