01-ai/Yi-6B

cArlIcon committed on Nov 6, 2023

Commit 55eda1d • 1 Parent(s): a19d790

update README

Files changed (1)

README.md +8 -5

@@ -13,12 +13,14 @@ license_link: LICENSE
 The **Yi** series models are large language models trained from scratch by
 developers at [01.AI](https://01.ai/). The first public release contains two
-bilingual(English/Chinese) base models with the parameter sizes of 6B and 34B.
-Both of them are trained with 4K sequence length and can be extended to 32K
-during inference time.
+bilingual(English/Chinese) base models with the parameter sizes of 6B(`Yi-6B`)
+and 34B(`Yi-34B`). Both of them are trained with 4K sequence length and can be
+extended to 32K during inference time. The `Yi-6B-200K` and `Yi-34B-200K` are
+base model with 200K context length.
 ## News
+- 🎯 **2023/11/06**: The base model of `Yi-6B-200K` and `Yi-34B-200K` with 200K context length.
 - 🎯 **2023/11/02**: The base model of `Yi-6B` and `Yi-34B`.
@@ -36,8 +38,9 @@ during inference time.
 | Aquila-34B    |   67.8   |   71.4   |   63.1   |    -     |    -     |           -            |           -           |      -      |
 | Falcon-180B   |   70.4   |   58.0   |   57.8   |   59.0   |   54.0   |          77.3          |         68.8          |    34.0     |
 | Yi-6B         |   63.2   |   75.5   |   72.0   |   72.2   |   42.8   |          72.3          |         68.7          |    19.8     |
-| **Yi-34B**    | **76.3** | **83.7** | **81.4** | **82.8** | **54.3** |        **80.1**        |       **76.4**        |    37.1     |
+| Yi-6B-200K    |   64.0   |   75.3   |   73.5   |   73.9   |   42.0   |          72.0          |         69.1          |    19.0     |
+| **Yi-34B**    | **76.3** | **83.7** |   81.4   |   82.8   | **54.3** |        **80.1**        |         76.4          |    37.1     |
+| Yi-34B-200K   |   76.1   |   83.6   | **81.9** | **83.4** |   52.7   |          79.7          |       **76.6**        |    36.3     |
 While benchmarking open-source models, we have observed a disparity between the
 results generated by our pipeline and those reported in public sources (e.g.