---
language:
- en
license: mit
library_name: transformers
base_model:
- Qwen/Qwen2.5-32B-Instruct
datasets:
- Magpie-Align/Magpie-Pro-300K-Filtered
model-index:
- name: TheBeagle-v2beta-32B-MGS
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 45.03
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 58.07
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 39.43
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.13
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 24.5
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 54.57
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
      name: Open LLM Leaderboard
---

# TheBeagle-v2beta-32B-MGS
This model is an experimental version of our latest innovation: `MGS`. It's up to you to figure out what it means, but it's very explicit.
We didn't apply our known `UNA` algorithm to the forward pass, but the two are entirely compatible: they operate in different parts of the neural network and in different ways, though both can be seen as regularization techniques.

## MGS
MGS stands for... Many-Geeks-Searching... and that's it. Hint: `1+1 is 2, and 1+1 is not 3`

We still believe one epoch should be enough, so we trained for just one epoch.

## Dataset
We used the first decent (in corpus quality and size) dataset on the Hub: `Magpie-Align/Magpie-Pro-300K-Filtered`.
Kudos to the Magpie team for contributing solid datasets that I personally find very good for ablation.

The model achieves the following results on the evaluation set:
- Loss: 0.5378 (1 epoch), outperforming the baseline model.
## Quants

[All versions available](https://huggingface.co/fblgit/TheBeagle-v2beta-MGS-GGUF/tree/main)

... being uploaded ...

## Licensing terms

*Quantized versions of this model must ONLY be distributed from the author's repository; submit a commit/PR and be credited for it.*

## Training
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 25
- num_epochs: 1
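
As a quick sanity check on how these numbers fit together (an illustrative sketch, not the actual training code): the total batch sizes follow from per-device batch size × devices × gradient accumulation, and the schedule is a standard warmup-then-cosine curve. The `total_steps` value (~828) is only inferred from the results table below (step 798 at epoch 0.9629), so treat it as approximate.

```python
import math

def effective_batch_size(per_device, num_devices, grad_accum):
    # total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
    return per_device * num_devices * grad_accum

def cosine_lr(step, peak_lr=8e-5, warmup_steps=25, total_steps=828):
    # Linear warmup for the first `warmup_steps`, then cosine decay toward 0.
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * peak_lr * (1 + math.cos(math.pi * progress))

print(effective_batch_size(2, 8, 4))  # train -> 64
print(effective_batch_size(2, 8, 1))  # eval  -> 16
```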

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 9.8642 | 0.0012 | 1 | 0.7195 |
| 2.077 | 0.0507 | 42 | 0.6161 |
| 1.0325 | 0.1014 | 84 | 0.6093 |
| 0.8945 | 0.1520 | 126 | 0.5962 |
| 0.8532 | 0.2027 | 168 | 0.5869 |
| 0.8185 | 0.2534 | 210 | 0.5805 |
| 0.81 | 0.3041 | 252 | 0.5719 |
| 0.7901 | 0.3548 | 294 | 0.5663 |
| 0.7766 | 0.4054 | 336 | 0.5618 |
| 0.7687 | 0.4561 | 378 | 0.5590 |
| 0.7443 | 0.5068 | 420 | 0.5564 |
| 0.7494 | 0.5575 | 462 | 0.5525 |
| 0.7787 | 0.6081 | 504 | 0.5485 |
| 0.7381 | 0.6588 | 546 | 0.5466 |
| 0.7359 | 0.7095 | 588 | 0.5444 |
| 0.7447 | 0.7602 | 630 | 0.5435 |
| 0.7378 | 0.8109 | 672 | 0.5415 |
| 0.7302 | 0.8615 | 714 | 0.5398 |
| 0.7476 | 0.9122 | 756 | 0.5391 |
| 0.715 | 0.9629 | 798 | 0.5378 |

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__TheBeagle-v2beta-32B-MGS)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 40.29 |
| IFEval (0-Shot)     | 45.03 |
| BBH (3-Shot)        | 58.07 |
| MATH Lvl 5 (4-Shot) | 39.43 |
| GPQA (0-shot)       | 20.13 |
| MuSR (0-shot)       | 24.50 |
| MMLU-PRO (5-shot)   | 54.57 |
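
The reported average is just the unweighted mean of the six benchmark scores, which can be verified in a couple of lines (a quick check, not an official recomputation):

```python
# Recompute the leaderboard average from the six benchmark scores above.
scores = {
    "IFEval (0-Shot)": 45.03,
    "BBH (3-Shot)": 58.07,
    "MATH Lvl 5 (4-Shot)": 39.43,
    "GPQA (0-shot)": 20.13,
    "MuSR (0-shot)": 24.50,
    "MMLU-PRO (5-shot)": 54.57,
}
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # -> 40.29
```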

## Thanks
- Qwen Team for their outstanding model
- MagPie Team for contributing plenty of datasets
- Cybertron Cloud Compute