leaderboard-pt-pr-bot committed on
Commit
69ed565
1 Parent(s): c4d7e8d

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +141 -4
README.md CHANGED
@@ -1,18 +1,139 @@
  ---
  library_name: peft
  tags:
  - Gemma
  - Portuguese
  - Bode
  - Alpaca
- license: mit
- language:
- - pt
  metrics:
  - accuracy
  - precision
  - f1
  - recall
  ---

  # GemBode-2b-it
@@ -110,4 +231,20 @@ Se você deseja utilizar o GemBode-2b-it em sua pesquisa, cite-o da seguinte man
  doi = { 10.57967/hf/1879 },
  publisher = { Hugging Face }
  }
- ```
  ---
+ language:
+ - pt
+ license: mit
  library_name: peft
  tags:
  - Gemma
  - Portuguese
  - Bode
  - Alpaca
  metrics:
  - accuracy
  - precision
  - f1
  - recall
+ model-index:
+ - name: GemBode-2b-it
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 21.62
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 25.45
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 27.33
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 53.1
+       name: f1-macro
+     - type: pearson
+       value: 15.57
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 53.05
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 66.89
+       name: f1-macro
+     - type: f1_macro
+       value: 24.22
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia-temp/tweetsentbr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 37.47
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/GemBode-2b-it
+       name: Open Portuguese LLM Leaderboard
  ---

  # GemBode-2b-it
 
  doi = { 10.57967/hf/1879 },
  publisher = { Hugging Face }
  }
+ ```
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/recogna-nlp/GemBode-2b-it)
+
+ |          Metric          |  Value  |
+ |--------------------------|---------|
+ |Average                   |**36.08**|
+ |ENEM Challenge (No Images)|    21.62|
+ |BLUEX (No Images)         |    25.45|
+ |OAB Exams                 |    27.33|
+ |Assin2 RTE                |    53.10|
+ |Assin2 STS                |    15.57|
+ |FaQuAD NLI                |    53.05|
+ |HateBR Binary             |    66.89|
+ |PT Hate Speech Binary     |    24.22|
+ |tweetSentBR               |    37.47|
+
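
The Average row appears to be the unweighted mean of the nine task scores in the results table. The PR does not say how it is computed, so this is an inference from the numbers. A minimal sketch of that arithmetic follows; the `scores` dictionary and variable names are illustrative and not part of the leaderboard tooling.

```python
# Task scores copied from the results table in this PR.
# Assumption: the leaderboard average is a plain, unweighted mean.
scores = {
    "ENEM Challenge (No Images)": 21.62,
    "BLUEX (No Images)": 25.45,
    "OAB Exams": 27.33,
    "Assin2 RTE": 53.10,
    "Assin2 STS": 15.57,
    "FaQuAD NLI": 53.05,
    "HateBR Binary": 66.89,
    "PT Hate Speech Binary": 24.22,
    "tweetSentBR": 37.47,
}

# Unweighted mean over all tasks, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)

print(f"Average over {len(scores)} tasks: {average:.2f}")
```

Under this assumption the result matches the 36.08 shown in the table.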