---
license: apache-2.0
datasets:
- wenbopan/RefGPT-Fact-v2-8x
- wenbopan/anti-haystack
- wenbopan/OpenHermes-2.5-zh
language:
- zh
- en
---
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/2peGbPRq4jE-OoS9ndkOx.jpeg)
# Fi-9B
Fi-9B is an improved [Yi-9B-200K](https://huggingface.co/01-ai/Yi-9B-200K) with extensive instruction tuning on [Fusang-V1](https://huggingface.co/datasets/wenbopan/Fusang-v1). Compared to Yi-9B-200K, Fi-9B performs better on various downstream tasks and in long-context modeling, thanks to the large-scale synthetic data in Fusang-V1.
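Since Fi-9B is an instruction-tuned chat model, prompts are built from role-tagged conversation turns. Below is a minimal sketch of prompt construction, assuming the ChatML format used by Yi chat models; the helper name is hypothetical, and in practice you should prefer the tokenizer's built-in chat template (`tokenizer.apply_chat_template`):

```python
# Sketch of ChatML prompt construction (assumption: Fi-9B follows the
# ChatML format of Yi chat models -- verify against the tokenizer's
# chat template before relying on this).

def build_chatml_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts as a ChatML string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the following report."},
])
# The resulting string would then be tokenized and passed to the model,
# e.g. via transformers' AutoTokenizer / AutoModelForCausalLM.
```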
## Performance
Fi-9B improves on Yi-9B-200K in most dimensions, especially long-range modeling and bilingual (English, Chinese) understanding. Fi-9B is competitive among open-source models of around 9B parameters: it performs well on fact-based benchmarks and is preferred by LLM judges.
### Fact-based Evaluation (Open LLM Leaderboard)
| **Metric** | **MMLU** | **GSM8K** | **HellaSwag** | **TruthfulQA** | **ARC** | **Winogrande** |
| -------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- |
| **Yi-9B-200K** | 65.73 | 50.49 | 56.72 | 33.80 | 69.25 | 71.67 |
| **Fi-9B-200K** | **68.80** | **63.08** | **57.28** | **40.86** | **72.58** | 71.11 |
### Long-context Modeling (LongBench)
| **Name** | **Average_zh** | **Average_en** | **Code Completion** |
|----------------|----------------|----------------|---------------------|
| **Yi-9B-200K** | 30.288 | 36.7071 | 72.2 |
| **Fi-9B-200K** | **41.092** | **40.9536** | 46.0 |
<details>
<summary>Score breakdown</summary>

| **Name** | **Few-shot Learning_en** | **Synthetic Tasks_en** | **Single-Doc QA_en** | **Multi-Doc QA_en** | **Summarization_en** | **Few-shot Learning_zh** | **Synthetic Tasks_zh** | **Single-Doc QA_zh** | **Multi-Doc QA_zh** | **Summarization_zh** |
|----------------|--------------------------|------------------------|----------------------|---------------------|----------------------|--------------------------|------------------------|----------------------|---------------------|----------------------|
| **Yi-9B-200K** | 60.6 | 22.8 | 30.9 | 38.9 | 25.8 | 46.5 | 28.0 | 49.6 | 17.7 | 9.7 |
| **Fi-9B-200K** | **63.8** | **40.2** | **36.2** | 38.0 | **26.3** | 30.0 | **75.1** | **55.6** | **30.7** | **14.1** |
</details>
<!--### Performance on Preference TODO-->
### Bilingual Ability (CMMLU & MMLU)
| **Name** | **MMLU** | **CMMLU** |
| -------------- | --------- | --------- |
| **Yi-9B-200K** | 65.73 | 71.97 |
| **Fi-9B-200K** | **68.80** | **73.28** |
## Current Limitations
- This version of Fi-9B may fail to stop generation in some scenarios. I will fix that soon.
- Compared to the original Yi-9B-200K, Fi-9B shows degraded code-completion ability (see the LongBench table above). This may be due to the lack of raw code data during instruction tuning.