What should a finetuned model's license be if the model is MIT but the datasets are Apache 2.0 and cc-by-4.0

#866

by rasyosef - opened Jul 31

Discussion

rasyosef

Jul 31

I submitted an instruction-tuned version of Microsoft's Phi 1.5 LLM, rasyosef/Phi-1_5-Instruct-v0.1, to the Open LLM Leaderboard last week, but it later disappeared from the Pending Evaluation Queue. Was it removed because of the datasets I used, or for some other reason?

The base model, Phi 1.5, has an MIT license, but some of the datasets I used are licensed under Apache 2.0 and cc-by-4.0. Which of these licenses would the finetuned rasyosef/Phi-1_5-Instruct-v0.1 model inherit?

Supervised Fine-Tuning
- teknium/OpenHermes-2.5 ~ MIT License
Direct Preference Optimization (DPO)
- Used a combination of the following preference datasets
  - HuggingFaceH4/ultrafeedback_binarized ~ MIT License
  - argilla/distilabel-intel-orca-dpo-pairs ~ Apache 2.0 License
  - argilla/distilabel-math-preference-dpo ~ Apache 2.0 License
  - jondurbin/py-dpo-v0.1 ~ cc-by-4.0 License
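To make the mix of licenses easier to see at a glance, here is a minimal sketch that groups the base model and the datasets listed above by license. The helper name `group_by_license` and the label used for the base model are made up for illustration, and the script only tallies what is written in this post; it does not decide which license the finetuned model should carry.

```python
from collections import defaultdict

# Licenses exactly as listed in this post. The base-model label is an
# illustrative name, not a repository id.
LICENSES = {
    "Phi 1.5 (base model)": "MIT",
    "teknium/OpenHermes-2.5": "MIT",
    "HuggingFaceH4/ultrafeedback_binarized": "MIT",
    "argilla/distilabel-intel-orca-dpo-pairs": "Apache 2.0",
    "argilla/distilabel-math-preference-dpo": "Apache 2.0",
    "jondurbin/py-dpo-v0.1": "cc-by-4.0",
}


def group_by_license(licenses: dict) -> dict:
    """Hypothetical helper: map each license to the sorted list of sources using it."""
    grouped = defaultdict(list)
    for source, license_name in licenses.items():
        grouped[license_name].append(source)
    return {name: sorted(sources) for name, sources in grouped.items()}


grouped = group_by_license(LICENSES)
for license_name, sources in grouped.items():
    print(f"{license_name}: {', '.join(sources)}")
```

This only shows which terms are in play (three distinct licenses here); whether they can be combined is a separate question.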

rasyosef

Jul 31

I found the submission entry in the requests dataset and it shows the status as Failed.

https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/rasyosef/Phi-1_5-Instruct-v0.1_eval_request_False_bfloat16_Original.json

{
    "model": "rasyosef/Phi-1_5-Instruct-v0.1",
    "base_model": "",
    "revision": "69307c5e1ab1367aad818118253ddee7578f3c65",
    "precision": "bfloat16",
    "params": 1.415,
    "architectures": "PhiForCausalLM",
    "weight_type": "Original",
    "status": "FAILED",
    "submitted_time": "2024-07-25T01:54:51Z",
    "model_type": "\ud83d\udcac : \ud83d\udcac chat models (RLHF, DPO, IFT, ...)",
    "job_id": "7816415",
    "job_start_time": "2024-07-29T08:27:00.038162",
    "use_chat_template": true,
    "sender": "rasyosef"
}

If possible I would like to know the reason. I apologize in advance in case the failure was caused by something on my end.
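For anyone checking their own submission the same way, the status lookup on a request entry can be sketched as below. This is only an illustration: the JSON is a trimmed copy of the entry above, `summarize_request` is a hypothetical helper, and treating `"FAILED"` as the only failure value is an assumption based on this single file.

```python
import json

# Trimmed copy of the request entry shown above (only the fields used here).
REQUEST_JSON = """
{
    "model": "rasyosef/Phi-1_5-Instruct-v0.1",
    "base_model": "",
    "revision": "69307c5e1ab1367aad818118253ddee7578f3c65",
    "precision": "bfloat16",
    "status": "FAILED",
    "submitted_time": "2024-07-25T01:54:51Z",
    "job_start_time": "2024-07-29T08:27:00.038162"
}
"""


def summarize_request(raw: str) -> dict:
    """Hypothetical helper: extract the fields needed to spot a failed evaluation."""
    entry = json.loads(raw)
    return {
        "model": entry["model"],
        "revision": entry["revision"],
        "status": entry["status"],
        # Assumption: "FAILED" is the value that marks a failed run.
        "failed": entry["status"] == "FAILED",
        # An empty string means the field was not filled in at submission time.
        "has_base_model": bool(entry.get("base_model")),
    }


summary = summarize_request(REQUEST_JSON)
print(summary["model"], summary["status"])
```

The entry records that the run failed and that `base_model` was empty, but it does not say why the job failed.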

clefourrier

Open LLM Leaderboard org Aug 1

Hi!
Thanks for your issue!

For your model, it failed at loading - it's a bit hard to say if it was a hardware failure or model failure, so I'm passing it to pending again and we'll see how it relaunches.

For the license, hard to say! MIT and Apache are compatible together iirc, but not sure how that plays with a cc by.

rasyosef

Aug 1

Cool, Thanks!

rasyosef

Aug 8

Hey @clefourrier, can you please change the revision to the latest one for the rasyosef/Phi-1_5-Instruct-v0.1 submission?

https://huggingface.co/rasyosef/Phi-1_5-Instruct-v0.1

The model is at the top of the queue, but I wanted to avoid any failure due to incomplete information in the model card.

The "base_model" field was not specified in the previous version of the model card.

Latest Revision:
f4c405ee4bff5dc1a69383f3fe682342c9c87c77

clefourrier

Open LLM Leaderboard org Aug 9

Done, added f4c405ee4bff5dc1a69383f3fe682342c9c87c77 as revision.

I'm closing this issue, feel free to reopen if needed!

clefourrier changed discussion status to closed Aug 9
