---
language:
- zh
- en
license: apache-2.0
datasets:
- wenbopan/Fusang-v1
- wenbopan/OpenOrca-zh-20k
---

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/s21sMRxRT56c5t4M15GBP.webp)

# Faro-Yi-9B
Faro-Yi-9B is an improved [Yi-9B-200K](https://huggingface.co/01-ai/Yi-9B-200K) with extensive instruction tuning on [Fusang-V1](https://huggingface.co/datasets/wenbopan/Fusang-v1). Compared to Yi-9B-200K, Faro-Yi-9B has gained greater capability in various downstream tasks and long-context modeling thanks to the large-scale synthetic data in Fusang-V1.

## Performance

Faro-Yi-9B improves on Yi-9B-200K in most dimensions, especially long-range modeling and bilingual (English, Chinese) understanding. It is competitive among open-source models at around 9B parameters: it performs well on fact-based benchmarks and is preferred by LLM judges.

### Fact-based Evaluation (Open LLM Leaderboard)

| **Metric**     | **MMLU**  | **GSM8K** | **HellaSwag** | **TruthfulQA** | **ARC**     | **Winogrande** |
| -------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- |
| **Yi-9B-200K** | 65.73     | 50.49     | 56.72         | 33.80          | 69.25       | 71.67          |
| **Faro-Yi-9B** | **68.80** | **63.08** | **57.28**     | **40.86**      | **72.58**   | 71.11          |
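
To make the size of each change easier to read, the per-benchmark differences can be computed directly from the table above. The snippet below is only an illustrative sketch: the variable names are hypothetical, the scores are copied by hand from the table, and the unweighted mean is a convenience summary rather than an official leaderboard average.

```python
# Scores copied by hand from the table above (Open LLM Leaderboard metrics).
benchmarks = ["MMLU", "GSM8K", "HellaSwag", "TruthfulQA", "ARC", "Winogrande"]
yi_9b_200k = [65.73, 50.49, 56.72, 33.80, 69.25, 71.67]
faro_yi_9b = [68.80, 63.08, 57.28, 40.86, 72.58, 71.11]

# Point difference per benchmark (Faro-Yi-9B minus Yi-9B-200K), rounded to 2 decimals.
deltas = {
    name: round(new - old, 2)
    for name, old, new in zip(benchmarks, yi_9b_200k, faro_yi_9b)
}

# Unweighted mean of the differences; an assumption made here for illustration only.
mean_delta = round(sum(deltas.values()) / len(deltas), 2)

for name, delta in deltas.items():
    print(f"{name:<12}{delta:+.2f}")
print(f"{'Mean':<12}{mean_delta:+.2f}")
```

A positive value means Faro-Yi-9B scores higher on that benchmark; Winogrande is the only metric in this table where the value is negative.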

### Long-context Modeling (LongBench)

| **Name**       | **Average_zh** | **Average_en** | **Code Completion** |
|----------------|----------------|----------------|---------------------|
| **Yi-9B-200K** | 30.288         | 36.7071        | 72.2                |
| **Faro-Yi-9B** | **41.092**     | **40.9536**    | 46.0                |

<details>
<summary>Score breakdown</summary>

| **Name**       | **Few-shot Learning_en** | **Synthetic Tasks_en** | **Single-Doc QA_en** | **Multi-Doc QA_en** | **Summarization_en** | **Few-shot Learning_zh** | **Synthetic Tasks_zh** | **Single-Doc QA_zh** | **Multi-Doc QA_zh** | **Summarization_zh** |
|----------------|--------------------------|------------------------|----------------------|---------------------|----------------------|--------------------------|------------------------|----------------------|---------------------|----------------------|
| **Yi-9B-200K** | 60.6                     | 22.8                   | 30.9                 | 38.9                | 25.8                 | 46.5                     | 28.0                   | 49.6                 | 17.7                | 9.7                  |
| **Faro-Yi-9B** | **63.8**                 | **40.2**               | **36.2**             | 38.0                | **26.3**             | 30.0                     | **75.1**               | **55.6**             | **30.7**            | **14.1**             |

</details>

<!--### Performance on Preference TODO-->

### Bilingual Ability (CMMLU & MMLU)

| **Name**       | **MMLU**  | **CMMLU** |
| -------------- | --------- | --------- |
| **Yi-9B-200K** | 65.73     | 71.97     |
| **Faro-Yi-9B** | **68.80** | **73.28** |