DachengZhang committed
Commit • cf2561e
1 Parent(s): 9fdce6f
Update README.md
README.md
CHANGED
@@ -45,7 +45,9 @@ pipeline_tag: text-generation
 
 # Model Introduction
 
-- Orion-14B
+- Orion-14B-Chat is fine-tuned from Orion-14B-Base using a high-quality corpus of approximately 850,000 entries (SFT only), and it also supports Chinese, English, Japanese, and Korean. It performs exceptionally well on the MT-Bench and AlignBench evaluation sets, significantly surpassing other models of the same parameter scale on multiple metrics.
+
+- The 850,000-entry fine-tuning corpus comprises two parts: approximately 220,000 manually curated high-quality entries and 630,000 entries selected from open-source data through model-based filtering and semantic deduplication. Among these, the Japanese and Korean data, totaling 70,000 entries, have only undergone basic cleaning and deduplication.
 
 - The Orion-14B series models exhibit the following features:
 - Among models with 20B-parameter scale level, Orion-14B-Base model shows outstanding performance in comprehensive evaluations.
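The first added bullet describes Orion-14B-Chat as a multilingual chat model fine-tuned from Orion-14B-Base. A minimal loading sketch follows, assuming the checkpoint is published on the Hugging Face Hub as OrionStarAI/Orion-14B-Chat and loads through the standard transformers AutoTokenizer / AutoModelForCausalLM API with trust_remote_code=True; the repo id, dtype choice, and plain-string prompt are illustrative assumptions, not something this commit specifies.

```python
# Hedged sketch: load Orion-14B-Chat via the standard transformers API.
# The repo id and the plain-prompt usage below are assumptions; consult the
# model card for the exact chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionStarAI/Orion-14B-Chat"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 14B parameters; half precision keeps memory manageable
    device_map="auto",
    trust_remote_code=True,
)

# The model is advertised as supporting Chinese, English, Japanese, and Korean.
prompt = "Hello! Please introduce yourself."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```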
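The second added bullet says the 630,000 open-source entries were selected by model filtering and semantically deduplicated. The commit does not describe that pipeline; below is a minimal sketch of embedding-based near-duplicate removal, assuming the sentence-transformers library, an all-MiniLM-L6-v2 encoder, and a cosine-similarity threshold of 0.9, all of which are hypothetical stand-ins rather than the pipeline actually used.

```python
# Hedged sketch of semantic deduplication: drop entries whose embedding is
# too similar (cosine) to an entry already kept. The encoder name and the
# 0.9 threshold are illustrative assumptions, not the actual Orion pipeline.
import numpy as np
from sentence_transformers import SentenceTransformer

def semantic_dedup(texts, model_name="all-MiniLM-L6-v2", threshold=0.9):
    encoder = SentenceTransformer(model_name)
    embeddings = encoder.encode(texts, normalize_embeddings=True)  # unit vectors
    kept_idx, kept_vecs = [], []
    for i, vec in enumerate(embeddings):
        # On normalized vectors, cosine similarity reduces to a dot product.
        if kept_vecs and np.max(np.stack(kept_vecs) @ vec) >= threshold:
            continue  # near-duplicate of an entry already kept
        kept_idx.append(i)
        kept_vecs.append(vec)
    return [texts[i] for i in kept_idx]

corpus = [
    "How do I reset my password?",
    "What is the procedure to reset my password?",
    "Explain the difference between SFT and RLHF.",
]
print(semantic_dedup(corpus))  # near-duplicates collapse, depending on the threshold
```

This greedy pass is O(n²) in the number of entries; at the scale described in the bullet, a production pipeline would typically shard the comparison with approximate nearest-neighbor search instead.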