How is the score calculated in open-llm-leaderboard
#833, opened by Amigozyq
Hi! Thank you for your outstanding work.
I noticed that the 'resps' field computed for the multiple-choice questions in open-llm-leaderboard looks like this: "[ [ "-17.890625", "False" ], [ "-26.65625", "False" ], [ "-19.109375", "False" ] ]".
Could you please explain the number of choices, and what the values stored for each choice mean?
Hi!
Thanks for opening this issue! Without a link to a specific file, it's hard to say anything about the number of choices.
However, for multichoice evaluations, you get a list with one entry per choice, each of the form [aggregated log-probability over the tokens, a boolean about whether the corresponding probability is strictly above 0.5, iirc]. For all our evals, you can ignore the boolean, as it is not used.
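To make the format concrete, here is a minimal sketch that reads the 'resps' value quoted in the question. It assumes that the predicted answer is the choice with the highest aggregated log-probability, which is a common convention for multiple-choice accuracy but is not confirmed in this thread. The helper names `pick_choice` and `relative_probs` are hypothetical and exist only for illustration.

```python
import math

# 'resps' value copied from the question: one [logprob, flag] pair per choice.
resps = [["-17.890625", "False"], ["-26.65625", "False"], ["-19.109375", "False"]]


def pick_choice(resps):
    """Return (index of the highest log-probability choice, list of log-probs).

    Assumption: the prediction is the argmax over the aggregated log-probs.
    The boolean in each pair is ignored, as the reply says it is unused.
    """
    logprobs = [float(lp) for lp, _flag in resps]
    best = max(range(len(logprobs)), key=logprobs.__getitem__)
    return best, logprobs


def relative_probs(logprobs):
    """Softmax over the choices, only to make the comparison easier to read."""
    m = max(logprobs)  # subtract the max for numerical stability
    exps = [math.exp(lp - m) for lp in logprobs]
    total = sum(exps)
    return [e / total for e in exps]


best, logprobs = pick_choice(resps)
print("number of choices:", len(resps))
print("index of most likely choice:", best)
print("relative probabilities:", relative_probs(logprobs))
```

In this example the list has three entries, so the question had three choices, and the first entry has the largest log-probability. Whether that counts as correct depends on the gold label stored alongside 'resps' in the same file.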
clefourrier changed discussion status to closed