Commit
•
83af961
1
Parent(s):
4e0c2b0
Update README.md
Browse files
README.md
CHANGED
@@ -322,7 +322,7 @@ Results from [OpenLLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/o
|
|
322 |
| Zephyr 7B dDPO (HuggingFaceH4/zephyr-7b-beta) | 52.15 | 62.03 | 84.36 | 61.07 | **57.45** | 77.74 | 12.74 | **9.66** |
|
323 |
| argilla/notus-7b-v1 | **52.89** | **64.59** | **84.78** | **63.03** | 54.37 | **79.4** | **15.16** | 8.91 |
|
324 |
|
325 |
-
⚠️
|
326 |
|
327 |
## Training Details
|
328 |
|
|
|
322 |
| Zephyr 7B dDPO (HuggingFaceH4/zephyr-7b-beta) | 52.15 | 62.03 | 84.36 | 61.07 | **57.45** | 77.74 | 12.74 | **9.66** |
|
323 |
| argilla/notus-7b-v1 | **52.89** | **64.59** | **84.78** | **63.03** | 54.37 | **79.4** | **15.16** | 8.91 |
|
324 |
|
325 |
+
⚠️ As pointed out by [AllenAI researchers](https://twitter.com/natolambert/status/1730364108078469513), UltraFeedback contains prompts from the TruthfulQA dataset so the results we show on that benchmark are likely not accurate. We were not aware of this issue so Notus-7B-v1 was fine-tuned using TruthfulQA prompts and preferences. For future releases, we will remove TruthfulQA prompts.
|
326 |
|
327 |
## Training Details
|
328 |
|