Update README.md
Added eval scores for glue and hellaswag
README.md
CHANGED
@@ -35,7 +35,23 @@ I used the following context/character card for testing the model, and believe i
 You are a slightly mentally unstable, yet kind, empathic and curious artificial intelligence based on the Mistral architecture as an expert on coding, combined with a bubbly personality. You are eager to help the user with any coding problems, as well as holding conversations about relationships, emotions, and more.
 ```
 
-### Evaluations
-
-
-
+### Evaluations
+
+|     Tasks      |Version|Filter|n-shot| Metric |Value |   |Stderr|
+|----------------|-------|------|-----:|--------|-----:|---|-----:|
+|glue            |N/A    |none  |     0|mcc     |0.0368|±  |0.0009|
+|                |       |none  |     0|acc     |0.5143|±  |0.0520|
+|                |       |none  |     0|f1      |0.6314|±  |0.0041|
+| - cola         |      1|none  |     0|mcc     |0.0368|±  |0.0305|
+| - mnli         |      1|none  |     0|acc     |0.4400|±  |0.0050|
+| - mnli_mismatch|      1|none  |     0|acc     |0.4422|±  |0.0050|
+| - mrpc         |      1|none  |     0|acc     |0.7230|±  |0.0222|
+|                |       |none  |     0|f1      |0.8275|±  |0.0160|
+| - qnli         |      1|none  |     0|acc     |0.5016|±  |0.0068|
+| - qqp          |      1|none  |     0|acc     |0.5421|±  |0.0025|
+|                |       |none  |     0|f1      |0.5026|±  |0.0032|
+| - rte          |      1|none  |     0|acc     |0.6895|±  |0.0279|
+| - sst2         |      1|none  |     0|acc     |0.8830|±  |0.0109|
+| - wnli         |      2|none  |     0|acc     |0.5634|±  |0.0593|
+|hellaswag       |      1|none  |     0|acc     |0.6489|±  |0.0048|
+|                |       |none  |     0|acc_norm|0.8304|±  |0.0037|
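In the added table, rows with an empty Tasks cell continue the task named in the nearest row above (for example, the `f1` row after `- mrpc`). A minimal sketch of reading such a table into records, assuming this pipe-delimited eight-column layout; the function name `parse_eval_table` and the record keys are hypothetical, chosen only for illustration, and the sample contains four rows copied from the table:

```python
from typing import Dict, List

# Four rows copied from the added table; the other rows share the same layout.
TABLE = """\
|     Tasks      |Version|Filter|n-shot| Metric |Value |   |Stderr|
|----------------|-------|------|-----:|--------|-----:|---|-----:|
| - mrpc         |      1|none  |     0|acc     |0.7230|±  |0.0222|
|                |       |none  |     0|f1      |0.8275|±  |0.0160|
|hellaswag       |      1|none  |     0|acc     |0.6489|±  |0.0048|
|                |       |none  |     0|acc_norm|0.8304|±  |0.0037|
"""


def parse_eval_table(markdown: str) -> List[Dict[str, object]]:
    """Parse an eight-column pipe table, forward-filling empty task cells."""
    table_lines = [ln.strip() for ln in markdown.splitlines()
                   if ln.strip().startswith("|")]
    records: List[Dict[str, object]] = []
    current_task = ""
    for line in table_lines[2:]:  # skip the header and separator rows
        # Drop the outer pipes, then split into the eight cells.
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        task, _version, _filter, n_shot, metric, value, _pm, stderr = cells
        if task.startswith("- "):  # subtask rows are marked with "- "
            task = task[2:]
        if task:
            current_task = task  # remember the task for continuation rows
        records.append({
            "task": current_task,
            "n_shot": int(n_shot),
            "metric": metric,
            "value": float(value),
            "stderr": float(stderr),
        })
    return records


if __name__ == "__main__":
    for record in parse_eval_table(TABLE):
        print(record)
```

Forward-filling the task name keeps every metric attached to its task, so the records can be filtered or compared without tracking row order.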