Some suggestions for evaluation priority voting mechanism

#801

by zhiminy - opened Jun 27

zhiminy

Jun 27

edited Jun 28

Introducing a community-driven voting system to prioritize model evaluations is an innovative approach to managing resource constraints and budgets effectively :) Thanks for your efforts!

However, without a mechanism to periodically increase the priority of less popular models, there is a risk that some models might never be evaluated, especially considering the high volume of daily submissions.
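To make the concern concrete, here is a minimal sketch of one possible "aging" rule, where a submission's priority grows with the time it has spent waiting. It is only an illustration and not how the leaderboard queue works: the `Submission` fields, the `effective_priority` function, and the `aging_bonus_per_day` weight are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    model_id: str       # hypothetical identifier of the submitted model
    votes: int          # community votes received so far
    days_waiting: int   # days since the model was submitted


def effective_priority(sub: Submission, aging_bonus_per_day: float = 0.5) -> float:
    """Votes plus a bonus that grows with waiting time (assumed weighting)."""
    return sub.votes + aging_bonus_per_day * sub.days_waiting


def next_to_evaluate(queue: list[Submission], aging_bonus_per_day: float = 0.5) -> Submission:
    """Pick the submission with the highest effective priority."""
    return max(queue, key=lambda s: effective_priority(s, aging_bonus_per_day))


queue = [
    Submission("popular-model", votes=20, days_waiting=1),
    Submission("niche-model", votes=1, days_waiting=60),
]

# With aging, the long-waiting model eventually overtakes the popular one.
print(next_to_evaluate(queue).model_id)
# With the bonus set to 0, ranking is by votes only.
print(next_to_evaluate(queue, aging_bonus_per_day=0.0).model_id)
```

With a bonus of 0 this reduces to pure vote ranking, so the weight controls how strongly waiting time competes with popularity.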

clefourrier

Open LLM Leaderboard org Jun 27

Hi!
Thanks for your interest in the leaderboard!

This is precisely the point, though: we are compute-constrained and needed a fair way to evaluate the models most relevant to the community first, so some models might indeed be evaluated much later than others if they are less important.
The model dropdown can act as a search bar, so I'm not sure what else you would want. Can you specify?

BarraHome

Jun 27

Hey @clefourrier
First of all, thank you for your hard work.

I understand the constraints and the need to prioritize models that are most relevant to the community. I appreciate your efforts to ensure fairness in the evaluation process.

I have a suggestion that might help streamline things: would it be possible to offer a paid option for model evaluations? This way, those who are less concerned with votes or popularity and more eager to get their models evaluated quickly could opt for this route.

It could also help support the resources needed for running the evaluations.

zhiminy

Jun 28

Prioritizing either highly voted or paid models for evaluation is quite a brilliant idea, tbh.

clefourrier

Open LLM Leaderboard org Jun 28

At the moment, we have no way to do that easily, but we've been thinking about something using user tokens + inference endpoints.
However, to give you an order of magnitude: evaluating a 7B model currently takes 2 to 3 hours on 8 H100-80G GPUs, and a 70B takes an order of magnitude more. It would be quite a budget ^^
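As a rough back-of-the-envelope sketch, the figures above can be turned into GPU-hours. The `gpu_hours` helper is a hypothetical name, and reading "an order of magnitude" as exactly 10x is an assumption made only for the arithmetic.

```python
def gpu_hours(wall_clock_hours: float, num_gpus: int = 8) -> float:
    """Total GPU-hours for a run that occupies num_gpus for wall_clock_hours."""
    return wall_clock_hours * num_gpus


# 7B model: 2 to 3 hours on 8 GPUs (figures from the post above).
low_7b, high_7b = gpu_hours(2), gpu_hours(3)

# 70B model: "an order of magnitude more", taken here as 10x (assumption).
SCALE_70B = 10
low_70b, high_70b = low_7b * SCALE_70B, high_7b * SCALE_70B

print(f"7B:  {low_7b:.0f} to {high_7b:.0f} GPU-hours")
print(f"70B: {low_70b:.0f} to {high_70b:.0f} GPU-hours")
```

Multiplying these totals by whatever per-GPU-hour price applies gives the budget for a single paid evaluation.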

clefourrier changed discussion status to closed Jul 29
