Refactor code: Pull leaderboard and model configurations out of app.py
I've been working on another version of this leaderboard, and I want to share some changes that might be of interest to those looking to fork the project or add new tabs.
This PR pulls the configuration settings for the different languages and the model metadata out of app.py and puts them in two separate configuration files: config.yaml and model_meta.yaml.
With this change, app.py goes from 2282 lines to 615 lines.
I believe this makes the app easier to debug and maintain, and new rows can be added by editing config.yaml.
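To illustrate the idea for anyone forking, here is a minimal sketch of how tabs could be built from parsed configuration instead of being hard-coded. It is not the code in this PR: the `CONFIG` dict is a hypothetical stand-in for the parsed contents of config.yaml, and the keys `tabs`, `name` and `tasks`, as well as `TabSpec` and `build_tabs`, are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the parsed contents of config.yaml.
# The real file in this PR may use a different schema.
CONFIG = {
    "tabs": [
        {"name": "English", "tasks": ["Classification", "Retrieval"]},
        {"name": "Portuguese", "tasks": ["Classification"]},
    ]
}


@dataclass
class TabSpec:
    """One leaderboard tab described purely by configuration."""
    name: str
    tasks: list


def build_tabs(config: dict) -> list:
    """Turn the parsed configuration into tab specifications."""
    return [TabSpec(name=t["name"], tasks=list(t["tasks"])) for t in config["tabs"]]


tabs = build_tabs(CONFIG)
for tab in tabs:
    print(tab.name, tab.tasks)
```

With a layout like this, adding a tab means adding one entry to the configuration rather than touching the application code.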
Additionally, I modified the get_mteb_data function. Previously, it looped through the model list and downloaded every model card each time a new tab was instantiated, which made the leaderboard take more than 30 minutes to initialize on my machine. In bbfe97ce, the model card results are cached during initialization, which reduces the initialization time to less than 5 minutes. (The refresh button still works.)
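The caching idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `make_cached_loader`, `fake_fetch` and the `refresh` flag are hypothetical names, and `fake_fetch` stands in for the real model card download.

```python
def make_cached_loader(fetch):
    """Wrap a fetch function so each model card is fetched at most once."""
    cache = {}

    def load(model_id, refresh=False):
        # Fetch only on a cache miss, or when a refresh is explicitly requested.
        if refresh or model_id not in cache:
            cache[model_id] = fetch(model_id)
        return cache[model_id]

    return load


calls = []  # records every simulated download


def fake_fetch(model_id):
    # Hypothetical stand-in for downloading a model card.
    calls.append(model_id)
    return {"model": model_id}


load_card = make_cached_loader(fake_fetch)

# Two tabs request the same models; each card is fetched only once.
for _tab in ("English", "Portuguese"):
    for model_id in ("model-a", "model-b"):
        load_card(model_id)

print(calls)
```

Passing `refresh=True` bypasses the cache for that model, which is one way a refresh button could keep working alongside the cache.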
You can see the changes running here: https://huggingface.co/spaces/pt-mteb/mteb_code_refactor (it should have the same interface and results as the current one).
Looks great! From my side, we can merge this. @tomaarsen do you have thoughts? 😊
I agree completely, I think these are excellent changes. It's a big step forward in terms of modularity and the caching is very welcome. Great job @eduagarcia!
I ran it locally, and I see no further issues.
Do/should we give points for this for MMTEB? cc @KennethEnevoldsen
- Tom Aarsen
@tomaarsen we do give points for this on MTEB. You can just open a PR to the score file called mteb_leaderboard_106.jsonl