wenbopan committed on
Commit
837b0fd
1 Parent(s): d8b4f73

Update results

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -20,10 +20,10 @@ Fi-9B enhances its ability compared to Yi-9B-200K in most dimensions, especially
 
 ### Fact-based Evaluation (Open LLM Leaderboard)
 
-| **Metric** | **winogrande** | **hellaswag** | **truthfulqa** | **ai2_arc** |
-|-----------------|----------------|---------------|----------------|-------------|
-| **Yi-9B-200K** | 71.67 | 56.72 | 33.80 | 69.25 |
-| **Fi-9B-200K** | 71.11 | **57.28** | **40.86** | **72.58** |
+| **Metric** | **MMLU** | GSM8K | **HellaSwag** | **TruthfulQA** | **Arc** | **Winogrande** |
+| -------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- |
+| **Yi-9B-200K** | 65.73 | 50.49 | 56.72 | 33.80 | 69.25 | 71.67 |
+| **Fi-9B-200K** | **68.80** | **63.08** | **57.28** | **40.86** | **72.58** | 71.11 |
 
 ### Long-context Modeling (LongBench)
 
@@ -46,10 +46,10 @@ Fi-9B enhances its ability compared to Yi-9B-200K in most dimensions, especially
 
 ### Bilingual Ability (CMMLU & MMLU)
 
-| **Name** | **CMMLU** |
-|----------------|-----------|
-| **Yi-9B-200K** | 71.97 |
-| **Fi-9B-200K** | 73.28 |
+| **Name** | MMLU | **CMMLU** |
+| -------------- | --------- | --------- |
+| **Yi-9B-200K** | 65.73 | 71.97 |
+| **Fi-9B-200K** | **68.80** | **73.28** |
 
 
 ## Current Limitations