mlabonne committed on
Commit
0ceb353
1 Parent(s): 3c5baf8

Update README.md

Files changed (1)
  1. README.md +11 -7
README.md CHANGED
@@ -30,6 +30,8 @@ It is based on a merge of the following models using [LazyMergekit](https://cola
 
 Special thanks to [Jon Durbin](https://huggingface.co/jondurbin), [Intel](https://huggingface.co/Intel), and [Argilla](https://huggingface.co/argilla) for the preference datasets.
 
+**Try the demo**: https://huggingface.co/spaces/mlabonne/AlphaMonarch-7B-GGUF-Chat
+
 ## 🔍 Applications
 
 This model uses a context window of 8k. I recommend using it with the Mistral Instruct chat template (works perfectly with LM Studio).
@@ -44,7 +46,7 @@ It is one of the very best 7B models in terms of instructing following and reaso
 
 ### Nous
 
-The evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval) on Nous suite. See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
+AlphaMonarch-7B is the best-performing 7B model on Nous' benchmark suite (evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval)). See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
 
 | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
 |---|---:|---:|---:|---:|---:|
@@ -60,9 +62,9 @@ The evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/ll
 
 ### EQ-bench
 
-AlphaMonarch-7B is the second best-performing 7B model on [EQ-bench](https://eqbench.com/) by Samuel J. Peach.
-
+AlphaMonarch-7B is also outperforming 70B and 120B parameter models on [EQ-bench](https://eqbench.com/) by [Samuel J. Paech](https://twitter.com/sam_paech), who kindly ran the evaluations.
 
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/dnCFxieqLiAC3Ll6CfdZW.png)
 
 ### MT-Bench
 
@@ -71,32 +73,34 @@ AlphaMonarch-7B is the second best-performing 7B model on [EQ-bench](https://eqb
 score
 model turn
 gpt-4 1 8.95625
-OmniBeagle-7B 1 8.32500
+OmniBeagle-7B 1 8.31250
 AlphaMonarch-7B 1 8.23750
 claude-v1 1 8.15000
+NeuralMonarch-7B 1 8.09375
 gpt-3.5-turbo 1 8.07500
 claude-instant-v1 1 7.80000
 
-
 ########## Second turn ##########
 score
 model turn
 gpt-4 2 9.025000
 claude-instant-v1 2 8.012658
+OmniBeagle-7B 2 7.837500
 gpt-3.5-turbo 2 7.812500
 claude-v1 2 7.650000
 AlphaMonarch-7B 2 7.618750
-OmniBeagle-7B 2 7.587500
+NeuralMonarch-7B 2 7.375000
 
 ########## Average ##########
 score
 model
 gpt-4 8.990625
-OmniBeagle-7B 7.956250
+OmniBeagle-7B 8.075000
 gpt-3.5-turbo 7.943750
 AlphaMonarch-7B 7.928125
 claude-instant-v1 7.905660
 claude-v1 7.900000
+NeuralMonarch-7B 7.734375
 NeuralBeagle14-7B 7.628125
 ```
 
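
Since this commit rewrites several MT-Bench rows by hand, a quick consistency check can help when reviewing it. The sketch below is not part of the commit. It assumes the "Average" value is the plain mean of the first-turn and second-turn scores, which only holds if both turns have the same number of judged answers. That assumption fits the three 7B rows checked here, but not claude-instant-v1, whose reported average (7.905660) differs slightly from the simple mean of its two turns. The names `turn_scores`, `reported_average`, `mean_of_turns`, and `check_averages` are illustrative.

```python
# Per-turn MT-Bench scores copied from the updated README table in this commit.
turn_scores = {
    "OmniBeagle-7B": (8.31250, 7.837500),
    "AlphaMonarch-7B": (8.23750, 7.618750),
    "NeuralMonarch-7B": (8.09375, 7.375000),
}

# Values from the "Average" block of the same table.
reported_average = {
    "OmniBeagle-7B": 8.075000,
    "AlphaMonarch-7B": 7.928125,
    "NeuralMonarch-7B": 7.734375,
}


def mean_of_turns(first, second):
    """Unweighted mean of the two turn scores (assumes equal sample sizes)."""
    return (first + second) / 2


def check_averages(tol=1e-6):
    """Return {model: True/False} for whether the reported average matches."""
    results = {}
    for model, (first, second) in turn_scores.items():
        expected = mean_of_turns(first, second)
        results[model] = abs(expected - reported_average[model]) <= tol
    return results


if __name__ == "__main__":
    for model, ok in check_averages().items():
        print(model, "consistent" if ok else "MISMATCH")
```

Running the same check on the removed OmniBeagle-7B rows (8.32500 and 7.587500, average 7.956250) also passes, so both the old and new values are internally consistent under this assumption.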