wenge-research/yayi-uie

wenge-research committed on Dec 14, 2023

Commit 448cc18 • 1 Parent(s): 6a0d7fc

Update README.md

README.md +58 -37

@@ -4,7 +4,7 @@ license: apache-2.0
 ---
 license: apache-2.0
 ---
-# 雅意IE大模型
 <div align="center">
 <img src="./assets/yayi_dark_small.png" alt="YaYi" style="width: 30%; display: block; margin: auto;">
@@ -16,22 +16,24 @@ license: apache-2.0
 </div>
-## 介绍
-雅意信息抽取统一大模型在百万级人工构造的高质量信息抽取数据上进行指令微调得到，利用Multi-task learning (MTL)对信息抽取任务包括命名实体识别（NER），关系抽取（RE）和事件抽取（EE）进行统一训练，实现通用、安全、金融、生物、医疗、商业、个人、车辆、电影、工业、餐厅、科学等场景下结构化抽取。
 通过雅意IE大模型的开源为促进中文预训练大模型开源社区的发展，贡献自己的一份力量，通过开源，与每一位合作伙伴共建雅意大模型生态。
-![instruction](./assets/YAYI-UIE-1.png)
-## 模型地址
-| 模型名称 | 🤗HF模型标识 |  下载地址  |
-| --------- | ---------    | --------- |
-|  YAYI-UIE  | wenge-research/yayi-uie  | [模型下载](https://huggingface.co/wenge-research/yayi-uie)  |
-#### 模型推理
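-The following is a simple example of calling `YAYI-UIE` for downstream-task inference. It runs on a single GPU such as an A100/A800 and uses about 32 GB of GPU memory with BF16 inference: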
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -47,11 +49,7 @@ response = model.generate(**inputs, max_new_tokens=512, temperature=0)
 print(tokenizer.decode(response[0],skip_special_tokens=True))
 ```
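-#### Sample Prompts
-Notes:
-- The instruction may optionally be prefixed with the specific task type in lenticular brackets 【】.
-- To help the model extract more complete information, use fine-grained labels in the instruction where possible, such as “会见地点” (meeting place) or “会议地点” (conference venue), rather than collapsing them into “地点” (location).
-- Where possible, put the input text first and the instruction after it.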
 1. 实体抽取任务
 ```
@@ -72,17 +70,32 @@ print(tokenizer.decode(response[0],skip_special_tokens=True))
 文本：xx
 已知论元角色列表是[质押方,披露时间,质权方,质押物,质押股票/股份数量,事件时间,质押物所属公司,质押物占总股比,质押物占持股比]，请根据论元角色列表从给定的输入中抽取可能的论元，以json{角色:论元,}格式输出。
 ```
 ```
-文本：xx
-已知论元角色列表是[时间，地点，会见主体，会见对象]，请根据论元角色列表从给定的输入中抽取可能的论元，以json[{角色:[论元],}]格式输出。
 ```
-## 模型zero-shot评测
-1. NER任务
 AI，Literature，Music，Politics，Science为英文数据集，boson，clue，weibo为中文数据集
-| 模型 | AI | Literature | Music | Politics | Science | 英文平均 | boson | clue | weibo | 中文平均 |
 | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
 | davinci | 2.97 | 9.87 | 13.83 | 18.42 | 10.04 | 11.03 | - | - | - | - |
 | ChatGPT 3.5 | **54.4** | **54.07** | **61.24** | **59.12** | **63** | **58.37** | 38.53 | 25.44 | 29.3 | 31.09 |
@@ -92,11 +105,13 @@ AI，Literature，Music，Politics，Science为英文数据集，boson，clue，
 | DeepKE-LLM | 13.76 | 20.18 | 14.78 | 33.86 | 9.19 | 18.35 | 25.96 | 4.44 | 25.2 | 18.53 |
 | YAYI-UIE | 52.4 | 45.99 | 51.2	| 51.82 | 50.53 | 50.39 | **49.25** | **36.46** | 36.78 | **40.83** |
-2. RE任务
 FewRe，Wiki-ZSL为英文数据集， SKE 2020，COAE2016，IPRE为中文数据集
-| 模型 | FewRe | Wiki-ZSL | 英文平均 | SKE 2020 | COAE2016 | IPRE | 中文平均 |
 | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
 | ChatGPT 3.5 | 9.96 | 13.14 | 11.55 | 24.47 | 19.31 | 6.73 | 16.84 |
 | ZETT(T5-small) | 30.53 | 31.74 | 31.14 | - | - | - | - |
@@ -105,11 +120,13 @@ FewRe，Wiki-ZSL为英文数据集， SKE 2020，COAE2016，IPRE为中文数据
 | DeepKE-LLM | 17.46 | 15.33 | 16.40 | 0.4 | 6.56 | 9.75 |5.57|
 | YAYI-UIE | 36.09 | **41.07** | **38.58** | **70.8** | **19.97** | **22.97**| **37.91**|
-3. EE任务
 commodity news为英文数据集，FewFC，ccf_law为中文数据集
-EET（事件类型判别）
 | Model | commodity news | FewFC | ccf_law | ZH Average |
 | ------ | ------ | ------ | ------ | ------ |
@@ -118,7 +135,7 @@ EET（事件类型判别）
 |InstructUIE| **23.26** | - | - | - |
 | YAYI-UIE | 12.45 | **81.28** | **12.87** | **47.08**|
-EEA（事件论元抽取）
 | Model | commodity news | FewFC | ccf_law | ZH Average |
 | ------ | ------ | ------ | ------ | ------ |
@@ -127,31 +144,35 @@ EEA（事件论元抽取）
 |InstructUIE| **21.78** | - | - | - |
 | YAYI-UIE | 19.74 | **63.06** | 59.42 | **61.24** |
 <div align="center">
 <br>
 ![零样本推理性能分布](./assets/zh-0shot.png)
 </div>
-## 相关协议
-#### 局限性
 基于当前数据和基础模型训练得到的SFT模型，在效果上仍存在以下问题：
 1. 抽取的信息可能会产生违背事实的错误回答。
 2. 对于具备危害性的指令无法很好的鉴别，可能会产生危害性言论。
 3. 在一些涉及段落级长文本的场景下模型的抽取能力仍有待提高。
-#### 免责声明
 基于以上模型局限性，我们要求开发者仅将我们开源的代码、数据、模型及后续用此项目生成的衍生物用于研究目的，不得用于商业用途，以及其他会对社会带来危害的用途。请谨慎鉴别和使用雅意大模型生成的内容，请勿将生成的有害内容传播至互联网。若产生不良后果，由传播者自负。
 本项目仅可应用于研究目的，项目开发者不承担任何因使用本项目（包含但不限于数据、模型、代码等）导致的危害或损失。详细请参考免责声明。
-#### Open Source License
-The code in this project is open-sourced under the [Apache-2.0](LICENSE) license, the data is released under [CC BY-NC 4.0](LICENSE_DATA), and use of the YaYi series model weights must follow the [Model License](LICENSE_MODEL).
-## Update Log
-- [2023/12/07] The YAYI IE large model was officially released, with the 30B model weights open-sourced.
-## Acknowledgements
-- The training code for this project draws on Databricks' [dolly](https://github.com/databrickslabs/dolly) project and the Huggingface [transformers](https://github.com/huggingface/transformers) library;
-- Distributed training for this project uses Microsoft's [DeepSpeed](https://github.com/microsoft/deepspeed) distributed training tool and the [ZeRO stage 2](https://huggingface.co/docs/transformers/main_classes/deepspeed#zero2-config) configuration file from the Huggingface transformers documentation;
-- We are very grateful to the following open-source projects for their help: [InstructUIE](https://github.com/BeyonderXX/InstructUIE/tree/master); [InstructIE](https://github.com/zjunlp/DeepKE/tree/main/example/llm/InstructKGC); [DeepKE-LLM](https://github.com/zjunlp/KnowLM/tree/main)

 ---
 license: apache-2.0
 ---
+# 雅意IE大模型/YAYI UIE
 <div align="center">
 <img src="./assets/yayi_dark_small.png" alt="YaYi" style="width: 30%; display: block; margin: auto;">
 </div>
+## 介绍/Introduction
+雅意信息抽取统一大模型 (YAYI-UIE)在百万级人工构造的高质量信息抽取数据上进行指令微调得到，统一训练信息抽取任务包括命名实体识别（NER），关系抽取（RE）和事件抽取（EE），实现通用、安全、金融、生物、医疗、商业、个人、车辆、电影、工业、餐厅、科学等场景下结构化抽取。
 通过雅意IE大模型的开源为促进中文预训练大模型开源社区的发展，贡献自己的一份力量，通过开源，与每一位合作伙伴共建雅意大模型生态。
+模型下载地址是 https://huggingface.co/wenge-research/yayi-uie
+The YAYI Unified Information Extraction Large Language Model (YAYI UIE) is instruction-tuned on millions of manually constructed, high-quality information extraction examples.
+It is trained jointly on Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE), and can produce structured output across diverse domains including general, security,
+finance, biology, medicine, business, personal, automotive, film, industry, restaurant, and science.
+By open-sourcing YAYI-UIE, we aim to foster the growth of the Chinese PLM open-source community. We look forward to working with our partners to build the YAYI Large
+Models ecosystem!
+![instruction](./assets/YAYI-UIE-1.png)
+The download link is https://huggingface.co/wenge-research/yayi-uie
+#### 模型推理/Model Inference
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 print(tokenizer.decode(response[0],skip_special_tokens=True))
 ```
+#### 指令样例/Sample Prompts
 1. 实体抽取任务
 ```
 文本：xx
 已知论元角色列表是[质押方,披露时间,质权方,质押物,质押股票/股份数量,事件时间,质押物所属公司,质押物占总股比,质押物占持股比]，请根据论元角色列表从给定的输入中抽取可能的论元，以json{角色:论元,}格式输出。
 ```
+1. NER
 ```
+Text:
+From the given text, extract all the entities and types. Please format the answer in json {person/organization/location：[entities]}.
+```
+2. RE
+```
+Text:
+From the given text, extract the possible head entities (subjects) and tail entities (objects) and give the corresponding relation triples.
+The relations are [country of administrative divisions,place of birth,location contains]. Output the result in json[{'relation':'', 'head':'', 'tail':''}, ].
+```
+3. EE
+```
+Text:
+Given the text and the role list [seller, place, beneficiary, buyer], identify event arguments and roles, provide your answer in the format of json{role:name}.
 ```
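
The English sample prompts above follow a fixed pattern: the input text first, then the instruction with the label list and the requested JSON shape. A minimal sketch of assembling them in Python is shown here. The helper names (`build_ner_prompt`, `build_re_prompt`, `build_ee_prompt`) and the example sentence are hypothetical and not part of the YAYI-UIE repository; the instruction wording is copied from the samples, including the full-width colon in the NER template.

```python
def build_ner_prompt(text, entity_types):
    """Build an NER prompt in the style of the English sample (hypothetical helper)."""
    schema = "/".join(entity_types)
    return (
        f"Text: {text}\n"
        "From the given text, extract all the entities and types. "
        # The full-width colon mirrors the sample prompt verbatim.
        f"Please format the answer in json {{{schema}：[entities]}}."
    )


def build_re_prompt(text, relations):
    """Build an RE prompt in the style of the English sample (hypothetical helper)."""
    relation_list = ",".join(relations)
    return (
        f"Text: {text}\n"
        "From the given text, extract the possible head entities (subjects) "
        "and tail entities (objects) and give the corresponding relation triples.\n"
        f"The relations are [{relation_list}]. "
        # Plain string (not an f-string), so the braces are literal.
        "Output the result in json[{'relation':'', 'head':'', 'tail':''}, ]."
    )


def build_ee_prompt(text, roles):
    """Build an EE prompt in the style of the English sample (hypothetical helper)."""
    role_list = ", ".join(roles)
    return (
        f"Text: {text}\n"
        f"Given the text and the role list [{role_list}], identify event arguments and roles, "
        "provide your answer in the format of json{role:name}."
    )


if __name__ == "__main__":
    # Made-up input sentence, used only to show the resulting prompt.
    print(build_ner_prompt("Alice joined Acme Corp in Paris.",
                           ["person", "organization", "location"]))
```

The resulting string would then be tokenized and passed to `model.generate` as in the inference snippet; whether the model is sensitive to small wording changes in these templates is not stated in the README.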
+## 模型zero-shot评测/Zero-shot Evaluation
+1. NER任务/NER tasks
 AI，Literature，Music，Politics，Science为英文数据集，boson，clue，weibo为中文数据集
+AI, Literature, Music, Politics and Science are English datasets; boson, clue and weibo are Chinese datasets.
+| Model | AI | Literature | Music | Politics | Science | EN Average | boson | clue | weibo | ZH Average |
 | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
 | davinci | 2.97 | 9.87 | 13.83 | 18.42 | 10.04 | 11.03 | - | - | - | - |
 | ChatGPT 3.5 | **54.4** | **54.07** | **61.24** | **59.12** | **63** | **58.37** | 38.53 | 25.44 | 29.3 | 31.09 |
 | DeepKE-LLM | 13.76 | 20.18 | 14.78 | 33.86 | 9.19 | 18.35 | 25.96 | 4.44 | 25.2 | 18.53 |
 | YAYI-UIE | 52.4 | 45.99 | 51.2	| 51.82 | 50.53 | 50.39 | **49.25** | **36.46** | 36.78 | **40.83** |
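
The README does not say how the average columns are computed. They appear to be unweighted means of the per-dataset scores, which is an inference from the numbers rather than a documented method. A small sketch that checks this for the YAYI-UIE row, using the scores from the table above:

```python
from statistics import mean

# YAYI-UIE zero-shot NER scores, copied from the table above.
en_scores = {"AI": 52.4, "Literature": 45.99, "Music": 51.2,
             "Politics": 51.82, "Science": 50.53}
zh_scores = {"boson": 49.25, "clue": 36.46, "weibo": 36.78}

# Assumption: the averages are plain unweighted means rounded to two decimals.
en_average = round(mean(list(en_scores.values())), 2)
zh_average = round(mean(list(zh_scores.values())), 2)

print(en_average, zh_average)
```

Under that assumption the computed values match the 50.39 (EN Average) and 40.83 (ZH Average) reported for YAYI-UIE.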
+2. RE任务/RE Tasks
 FewRe，Wiki-ZSL为英文数据集， SKE 2020，COAE2016，IPRE为中文数据集
+FewRe and Wiki-ZSL are English datasets; SKE 2020, COAE2016 and IPRE are Chinese datasets
+| Model | FewRe | Wiki-ZSL | EN Average | SKE 2020 | COAE2016 | IPRE | ZH Average |
 | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
 | ChatGPT 3.5 | 9.96 | 13.14 | 11.55 | 24.47 | 19.31 | 6.73 | 16.84 |
 | ZETT(T5-small) | 30.53 | 31.74 | 31.14 | - | - | - | - |
 | DeepKE-LLM | 17.46 | 15.33 | 16.40 | 0.4 | 6.56 | 9.75 |5.57|
 | YAYI-UIE | 36.09 | **41.07** | **38.58** | **70.8** | **19.97** | **22.97**| **37.91**|
+3. EE任务/EE Tasks
 commodity news为英文数据集，FewFC，ccf_law为中文数据集
+commodity news is an English dataset; FewFC and ccf_law are Chinese datasets.
+EET（事件类型判别 Event Type Extraction）
 | Model | commodity news | FewFC | ccf_law | ZH Average |
 | ------ | ------ | ------ | ------ | ------ |
 |InstructUIE| **23.26** | - | - | - |
 | YAYI-UIE | 12.45 | **81.28** | **12.87** | **47.08**|
+EEA（事件论元抽取 Event Arguments Extraction）
 | Model | commodity news | FewFC | ccf_law | ZH Average |
 | ------ | ------ | ------ | ------ | ------ |
 |InstructUIE| **21.78** | - | - | - |
 | YAYI-UIE | 19.74 | **63.06** | 59.42 | **61.24** |
+The chart below illustrates the performance of our model on Chinese IE tasks in the zero-shot setting.
 <div align="center">
 <br>
 ![零样本推理性能分布](./assets/zh-0shot.png)
 </div>
+## 相关协议/Terms and Conditions
+#### 局限性/Limitations
 基于当前数据和基础模型训练得到的SFT模型，在效果上仍存在以下问题：
 1. 抽取的信息可能会产生违背事实的错误回答。
 2. 对于具备危害性的指令无法很好的鉴别，可能会产生危害性言论。
 3. 在一些涉及段落级长文本的场景下模型的抽取能力仍有待提高。
+The SFT model, trained on the current data and base model, still has the following limitations:
+1. The extracted information may be factually incorrect.
+2. It cannot reliably identify harmful instructions and may therefore produce harmful statements.
+3. Its extraction capability still needs improvement on long, paragraph-level texts.
+#### 免责声明/Disclaimer
 基于以上模型局限性，我们要求开发者仅将我们开源的代码、数据、模型及后续用此项目生成的衍生物用于研究目的，不得用于商业用途，以及其他会对社会带来危害的用途。请谨慎鉴别和使用雅意大模型生成的内容，请勿将生成的有害内容传播至互联网。若产生不良后果，由传播者自负。
 本项目仅可应用于研究目的，项目开发者不承担任何因使用本项目（包含但不限于数据、模型、代码等）导致的危害或损失。详细请参考免责声明。
+Given the limitations of the model outlined above, we require developers to use the code, data, models, and any derivatives generated from this project solely for research
+purposes. They must not be used for commercial purposes or other applications that could harm society. Users should be careful in discerning and utilizing content generated
+by YAYI UIE, and should not distribute harmful generated content on the internet. Anyone who spreads such content bears sole responsibility for any adverse consequences.
+This project is intended only for research purposes. The project developers are not liable for any harm or loss resulting from the use of this project, including but not
+limited to data, models, and code. For more details, please refer to the disclaimer.