Arnav0400 committed on
Commit
85d21c5
1 Parent(s): 103292f

initial readme

  1. README.md +29 -3
README.md CHANGED
@@ -1,3 +1,29 @@
1
- ---
2
- license: llama3
3
- ---
1
+ ---
2
+ license: llama3
3
+ ---
4
+ # 🔹 Key Highlights:
5
+
6
+ - 12% Fewer Parameters: nyun-llama3-62B comprises approximately 12% fewer parameters than the popular Llama-3-70B.
7
+ - Intact Performance: Despite having fewer parameters, our model performs on par with Llama-3-70B and occasionally outperforms it.
8
+ - No Fine-Tuning Required: The model has undergone no fine-tuning, showcasing the raw potential of our optimization techniques.
9
+
10
+ ## Pipeline and Collaboration
11
+
12
+ For insights into the pipeline and the list of methods used to optimize these models, check out our PruneGPT repository (https://github.com/nyunAI/PruneGPT).
13
+ Companies and organizations interested in working with us to release more open-source variants like this one are invited to reach out at [email protected].
14
+
15
+ ### Model Performance
16
+
17
+ | Dataset | Nyun-Llama3-62B | Meta-Llama3-70B | Meta-Llama2-70B | MBZUAI K2-65B |
18
+ | --- | --- | --- | --- | --- |
19
+ | MMLU (5-shot) | 78.9 | 79.5 | 69.7 | 67.9 |
20
+ | WinoGrande (5-shot) | 83.3 | 83.1 | 81.8 | 77.0 |
21
+ | BoolQ (0-shot) | 85.3 | 79.0 | 73.1 | 83.0 |
22
+ | HellaSwag (10-shot) | 85.8 | 88.0 | 86.9 | 85.5 |
23
+ | ARC Challenge (25-shot) | 65.9 | 68.8 | 67.2 | 64.8 |
24
+ | GSM8K (5-shot) | 70.9 | 76.9 | 52.6 | 50.2 |
25
+ | Average | 78.4 | 79.2 | 71.9 | 71.4 |
26
+
27
+
28
+ - **Developed by:** [Nyun AI](https://nyunai.com/)
29
+ - **Repository:** [GitHub](https://github.com/nyunAI/PruneGPT)
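
As a quick sanity check on the Model Performance table in the README diff above, the following sketch recomputes the Average row from the six per-benchmark scores. It assumes the Average is an unweighted mean rounded half-up to one decimal, which the README does not state explicitly. The parameter-reduction figure uses the nominal 62B and 70B sizes from the model names, not exact parameter counts, so it only roughly matches the "approximately 12%" claim. The names `scores`, `average`, and `nominal_reduction_pct` are illustrative, not part of the PruneGPT repository.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the Model Performance table, in row order:
# MMLU, WinoGrande, BoolQ, HellaSwag, ARC Challenge, GSM8K.
# Strings are used so Decimal parses them exactly (no float rounding drift).
scores = {
    "Nyun-Llama3-62B": ["78.9", "83.3", "85.3", "85.8", "65.9", "70.9"],
    "Meta-Llama3-70B": ["79.5", "83.1", "79.0", "88.0", "68.8", "76.9"],
    "Meta-Llama2-70B": ["69.7", "81.8", "73.1", "86.9", "67.2", "52.6"],
    "MBZUAI K2-65B": ["67.9", "77.0", "83.0", "85.5", "64.8", "50.2"],
}


def average(values):
    """Unweighted mean, rounded half-up to one decimal (an assumption)."""
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


averages = {model: average(vals) for model, vals in scores.items()}

# Gap between the original 70B model and the 62B variant on the average score.
gap = averages["Meta-Llama3-70B"] - averages["Nyun-Llama3-62B"]

# Nominal sizes in billions, taken from the model names only (assumption).
nominal_reduction_pct = (
    (Decimal(70) - Decimal(62)) / Decimal(70) * 100
).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

for model, avg in averages.items():
    print(f"{model}: {avg}")
print(f"Average gap vs. Meta-Llama3-70B: {gap}")
print(f"Nominal parameter reduction: {nominal_reduction_pct}%")
```

Under these assumptions the recomputed averages agree with the Average row in the table, and the 62B variant trails Meta-Llama3-70B by 0.8 points on the average.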