pipeline_tag: text-generation
inference: false
---

# Baichuan-13B-Base

<!-- Provide a quick summary of what the model is/does. -->

## Introduction

Baichuan-13B-Base is an open-source, commercially usable large language model with 13 billion parameters, developed by Baichuan Intelligence as the successor to [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B). It achieves the best results among models of its size on standard Chinese and English benchmarks. This release includes two versions: a pretrained model (Baichuan-13B-Base) and an aligned model (Baichuan-13B-Chat). Baichuan-13B has the following features:

1. **An open-source, commercially usable Chinese language model at the 10B+ scale**: Baichuan-13B-Base is a free, open-source, commercially usable Chinese pretrained language model with 13 billion parameters. It has not undergone any instruction tuning or benchmark-specific optimization, making it clean and highly customizable, and it fills the gap of highly usable Chinese pretrained models above 10 billion parameters.
2. **Larger size, more data**: Building on Baichuan-7B, the parameter count is expanded to 13 billion, and the model is trained on 1.4 trillion tokens of high-quality corpora, the largest training-data volume among open-source 13B models to date. It supports both Chinese and English, uses [ALiBi](https://arxiv.org/abs/2108.12409) positional encoding, and has a context window of 4,096 tokens.
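The list above mentions ALiBi positional encoding. As a minimal sketch of the idea (not Baichuan's actual implementation, which ships in the model's remote code), each attention head adds a linear penalty to its attention logits, proportional to the query-key distance and scaled by a head-specific slope:

```python
def alibi_slopes(n_heads):
    # Head-specific slopes: a geometric sequence with ratio 2^(-8/n_heads),
    # assuming n_heads is a power of two, as in the ALiBi paper.
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(slope, seq_len):
    # Bias added to one head's attention logits before softmax:
    # -slope * (distance from query i back to key j), zero on the diagonal.
    # Future positions (j > i) are handled by the causal mask, not by ALiBi.
    return [[-slope * (i - j) if j <= i else 0.0 for j in range(seq_len)]
            for i in range(seq_len)]
```

With 8 heads, for example, the slopes are 1/2, 1/4, ..., 1/256, so distant keys are penalized more than recent ones; because the bias depends only on relative distance, ALiBi lets the model extrapolate to sequences longer than those seen in training.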

## How to Get Started with the Model

The following is a 1-shot inference task using Baichuan-13B-Base: given the title of a work, the model outputs its author. The correct output is "夜雨寄北->李商隐".

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Base", device_map="auto", trust_remote_code=True)
inputs = tokenizer('登鹳雀楼->王之涣\n夜雨寄北->', return_tensors='pt')
inputs = inputs.to('cuda:0')
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
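The prompt above is simply newline-separated `work->author` pairs with the final pair left incomplete, so the model fills in the answer. A small helper (`build_few_shot_prompt` is illustrative, not part of the Baichuan repo) makes the pattern explicit and extends it to more shots:

```python
def build_few_shot_prompt(examples, query, sep="->"):
    # Each example is a (work, author) pair; the final line ends with the
    # separator so the model completes it with the missing author.
    lines = [f"{work}{sep}{author}" for work, author in examples]
    lines.append(f"{query}{sep}")
    return "\n".join(lines)
```

For instance, `build_few_shot_prompt([("登鹳雀楼", "王之涣")], "夜雨寄北")` reproduces the 1-shot prompt used above.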

The following is a 1-shot inference task using Baichuan-13B-Base: given the title of a work, the model outputs its author. The correct output is "One Hundred Years of Solitude->Gabriel Garcia Marquez".

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Base", device_map="auto", trust_remote_code=True)
inputs = tokenizer('Hamlet->Shakespeare\nOne Hundred Years of Solitude->', return_tensors='pt')
inputs = inputs.to('cuda:0')
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```