leaderboard-pt-pr-bot committed
Commit 3d9bc9b
Parent: 25e43ba

Adding the Open Portuguese LLM Leaderboard Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +22 -2
README.md CHANGED
@@ -1,6 +1,8 @@
 ---
+language:
+- pt
 license: mit
-library_name: "trl"
+library_name: trl
 tags:
 - DPO
 - WeniGPT
@@ -8,7 +10,6 @@ base_model: Weni/WeniGPT-2.2.3-Zephyr-7B-LLM_Base_2.0.3_SFT
 model-index:
 - name: Weni/WeniGPT-2.4.1-Zephyr-7B-3-epochs-LLM_Base_2.0.3_DPO
   results: []
-language: ['pt']
 ---
 
 # Weni/WeniGPT-2.4.1-Zephyr-7B-3-epochs-LLM_Base_2.0.3_DPO
@@ -84,3 +85,22 @@ The following hyperparameters were used during training:
 
 ### Hardware
 - Cloud provided: runpod.io
+
+
+# Open Portuguese LLM Leaderboard Evaluation Results
+
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/Weni/WeniGPT-2.4.1-Zephyr-7B-3-epochs-GPT-QA-1.0.1_DP_DPO) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+| Metric                   |  Value  |
+|--------------------------|---------|
+|Average                   |**61.64**|
+|ENEM Challenge (No Images)|    56.26|
+|BLUEX (No Images)         |    47.43|
+|OAB Exams                 |    38.22|
+|Assin2 RTE                |    88.45|
+|Assin2 STS                |    68.73|
+|FaQuAD NLI                |    61.31|
+|HateBR Binary             |    80.71|
+|PT Hate Speech Binary     |    66.08|
+|tweetSentBR               |    47.53|
+