winninghealth/WiNGPT2-Gemma-2-9B-Base

+---
+license: apache-2.0
+language:
+- en
+- zh
+tags:
+- medical
+---
+## WiNGPT2
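+[WiNGPT](https://github.com/winninghealth/WiNGPT2) is a GPT-based large language model for the medical vertical domain. It aims to integrate professional medical knowledge, healthcare information, and data in order to provide the healthcare industry with intelligent medical question answering, diagnostic support, and medical knowledge services, improving the efficiency of diagnosis and treatment and the quality of healthcare services.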
+## Changelog
+[2024/08/15] Updated the **WiNGPT2-Gemma-2-9B-Base** and **WiNGPT2-Gemma-2-9B-Chat** models (Chinese medical capability **improved by more than 13%** / multilingual) and the evaluation results (WiNEval-2.0).
+[2024/04/24] Added the quantized models WiNGPT2-Llama-3-8B-Chat-AWQ and WiNGPT2-Llama-3-8B-Chat-GGUF.
+[2024/04/23] Updated the WiNGPT2-Llama-3-8B-Base and WiNGPT2-Llama-3-8B-Chat models (enhanced Chinese / multilingual) and the evaluation results.
+[2024/04/01] Updated the WiNEval evaluation results.
+[2024/03/05] Open-sourced the 7B/14B-Chat-4bit model weights: [🤗](https://huggingface.co/winninghealth/WiNGPT2-7B-Chat-AWQ)WiNGPT2-7B-Chat-4bit and [🤗](https://huggingface.co/winninghealth/WiNGPT2-14B-Chat-AWQ)WiNGPT2-14B-Chat-4bit.
+[2023/12/20] Added the QR code for the user WeChat group, valid until December 27; scan it to join.
+[2023/12/18] Published the results of WiNEval-MCKQuiz, the Winning Health medical model evaluation scheme.
+[2023/12/12] Open-sourced the WiNGPT2 14B model weights: [🤗](https://huggingface.co/winninghealth/WiNGPT2-14B-Base)WiNGPT2-14B-Base and [🤗](https://huggingface.co/winninghealth/WiNGPT2-14B-Chat)WiNGPT2-14B-Chat.
+[2023/11/02] [34B model available for testing on the platform](https://wingpt.winning.com.cn/); [join the WeChat discussion group](https://github.com/winninghealth/WiNGPT2/blob/main/assets/WiNGPT_GROUP.JPG).
+[2023/10/13] Added a simple [Chatbot example](#部署) that supports basic multi-turn conversation.
+[2023/09/26] Open-sourced WiNGPT2 and the 7B model weights: [🤗](https://huggingface.co/winninghealth/WiNGPT2-7B-Base)WiNGPT2-7B-Base and [🤗](https://huggingface.co/winninghealth/WiNGPT2-7B-Chat)WiNGPT2-7B-Chat.
+## How to Use
+### Inference
+```python
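+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_path = "WiNGPT2-Gemma-2-9B-Chat"  # path to the downloaded chat model
+device = "cuda"
+
+# Load the tokenizer and model, then switch the model to evaluation mode
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
+model = model.eval()
+
+# The user message says "Hello, WiNGPT"
+messages = [{"role": "user", "content": "WiNGPT, 你好"}]
+# add_generation_prompt=True appends '<start_of_turn>assistant\n' so the model starts its reply
+input_ids = tokenizer.apply_chat_template(
+    conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt'
+)
+output_ids = model.generate(
+    input_ids.to(device),
+    eos_token_id=tokenizer.convert_tokens_to_ids('<end_of_turn>'),
+    max_new_tokens=1024,
+    repetition_penalty=1.1,
+)
+# Decode only the newly generated tokens, skipping the prompt
+response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
+print(response)
+# Example output: 你好！今天我能为你做些什么？ ("Hello! What can I do for you today?")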
+```
+### Prompts
+WiNGPT2-Gemma-2-9B-Chat uses a custom prompt format:
+Roles: system/user/assistant
+chat_template:
+```jinja2
+{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<start_of_turn>' + message['role'] + '\n' + message['content'] + '<end_of_turn>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<start_of_turn>assistant\n' }}{% endif %}
+```
+*Note: our chat_template differs slightly from that of the original model.*
+**Instruction prompt** example (the user says "Hello, WiNGPT"):
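To make the format concrete, here is a minimal sketch that mirrors the chat_template above in plain Python. The helper name `render_prompt` and the sample system message are hypothetical and only for illustration; in practice `tokenizer.apply_chat_template` renders the template, and the tokenizer may add further special tokens that this sketch does not model.

```python
def render_prompt(messages, add_generation_prompt=False):
    """Plain-Python mirror of the chat_template (illustrative helper, not part of the model repo)."""
    prompt = ""
    for message in messages:
        # Each turn is wrapped as <start_of_turn>{role}\n{content}<end_of_turn>\n
        prompt += "<start_of_turn>" + message["role"] + "\n" + message["content"] + "<end_of_turn>" + "\n"
    if add_generation_prompt:
        # Open the assistant turn so the model continues with its reply
        prompt += "<start_of_turn>assistant\n"
    return prompt


messages = [
    {"role": "system", "content": "You are a medical assistant."},  # hypothetical system prompt
    {"role": "user", "content": "WiNGPT, 你好"},
]
prompt = render_prompt(messages, add_generation_prompt=True)
print(prompt)
```

Printing the rendered string is a quick way to check that every turn is closed with `<end_of_turn>` and that the prompt ends with an open assistant turn.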
+```json
+[{"role": "user", "content": "WiNGPT, 你好"}]
+```
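+**Multi-turn conversation** example (the user greets WiNGPT, the assistant replies "Hello! What can I do for you today?", and the user then asks what to do about the flu):
+```json
+[
+  {"role": "user", "content": "WiNGPT, 你好"},
+  {"role": "assistant", "content": "你好！今天我能为你做些什么？"},
+  {"role": "user", "content": "流感应该怎么办啊？"}
+]
+```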
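A multi-turn conversation amounts to resending the growing message list on every turn. The sketch below shows one way to keep that history; `chat_turn` and `echo_reply` are hypothetical names, and `echo_reply` is only a stand-in for a real model call (for example, `apply_chat_template` followed by `model.generate`).

```python
def chat_turn(history, user_text, generate_reply):
    """Append a user turn, obtain a reply, and store it as the assistant turn."""
    history.append({"role": "user", "content": user_text})
    reply = generate_reply(history)  # the full history is passed on every turn
    history.append({"role": "assistant", "content": reply})
    return reply


def echo_reply(messages):
    # Stand-in for the model: a real implementation would render `messages`
    # with the chat template and decode the generated tokens.
    return "(reply to: " + messages[-1]["content"] + ")"


history = []
chat_turn(history, "WiNGPT, 你好", echo_reply)
chat_turn(history, "流感应该怎么办啊？", echo_reply)
print([m["role"] for m in history])
```

Passing the generation function as a parameter keeps the history handling independent of how the model is loaded or served.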
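+**Translation** example (the system prompt tells WiNGPT to act as a medical-domain assistant that accurately translates the user's Chinese or English input):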
+```json
+[
+  {"role": "system", "content": "作为医疗领域的智能助手，WiNGPT将提供中英翻译服务。用户输入的中文或英文内容将由WiNGPT进行准确的翻译，以满足用户的语言需求。"},
+  {"role": "user", "content": "Life is short, you know, and time is so swift; Rivers are wide, so wide, and ships sail far."}
+]
+```
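+## Model Card
+### Training Configuration and Parameters
+| Name                    | Training strategy                      | Length | Precision | Learning rate | Weight_decay | Epochs | GPUs   |
+| ----------------------- | -------------------------------------- | ------ | --------- | ------------- | ------------ | ------ | ------ |
+| WiNGPT2-Gemma-2-9B-Base | Continued pre-training (23G)           | 8192   | bf16      | 5e-5          | 0.05         | 2      | A100*8 |
+| WiNGPT2-Gemma-2-9B-Chat | Fine-tuning/alignment (450,000 samples) | 8192   | bf16      | 5e-6          | 0.012        | 2      | A100*8 |
+### Training Data
+The pre-training data totals about 23G, and the instruction fine-tuning and alignment data totals about 450,000 samples; see the [details](https://github.com/winninghealth/WiNGPT2?tab=readme-ov-file#%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE).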
+## Chinese Medical Evaluation - WiNEval 2.0
+Last updated: 2024-08-15
+|                                                              | Type                   | MCKQuiz-2.0 / (2024only) | MSceQA-2.0 |
+| ------------------------------------------------------------ | ---------------------- | ------------------------ | ---------- |
+| **WiNGPT2-Gemma-2-9B-Base**                                  | Continued Pre-training | 74.3 / 77.8              | /          |
+| [WiNGPT2-Llama-3-8B-Base](https://huggingface.co/winninghealth/WiNGPT2-Llama-3-8B-Base) | Continued Pre-training | 66.3 / 70.4              | /          |
+| gemma-2-9b                                                   | Pre-training           | 44.8 / 42.0              | /          |
+|                                                              |                        |                          |            |
+| **WiNGPT2-Gemma-2-9B-Chat**                                  | Finetuning/Alignment   | 73.5 / 77.8              | 82.9       |
+| gemma-2-9b-it                                                | Finetuning/Alignment   | 53.7 / 48.2              | 80.18      |
+| Llama-3.1-8B-Instruct                                        | Finetuning/Alignment   | 61.6 / 68.5              | 73.2       |
+| [WiNGPT2-Llama-3-8B-Chat](https://huggingface.co/winninghealth/WiNGPT2-Llama-3-8B-Chat) | Finetuning/Alignment   | 65.8 / 72.2              | 73.0       |
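+*MCKQuiz (objective questions): 13,060 multiple-choice questions across 17 subject categories. The model receives the question and its options and outputs an answer, which is checked against the reference answer; the reported metric is accuracy.*
+*MSceQA (subjective questions): scenario-based questions from specialized domains, covering 8 major business scenarios, 17 first-level categories, and 32 second-level categories. Human and/or model evaluators rate the model's answers for accuracy, relevance, consistency, completeness, and authority, and score the generated answers against the reference answers.*
+*MCKQuiz-2.0 (objective questions): adds 1% of new questions from the latest 2024 question bank. MSceQA-2.0 (subjective questions): adds 45 types (130%) of scenario questions.*
+[Historical WiNEval evaluation results](https://github.com/winninghealth/WiNGPT2?tab=readme-ov-file#2-%E5%8D%AB%E5%AE%81%E5%81%A5%E5%BA%B7%E5%8C%BB%E7%96%97%E6%A8%A1%E5%9E%8B%E6%B5%8B%E8%AF%84%E6%96%B9%E6%A1%88-winevalzero-shot)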
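The objective-question scoring described above (compare the model's chosen option with the reference answer and report accuracy) can be sketched as follows. This is not the WiNEval implementation: the function names, the single-answer assumption, the A-E option range, and the sample data are all hypothetical.

```python
import re


def extract_choice(model_output):
    """Return the first standalone option letter A-E in the output, or None."""
    match = re.search(r"(?<![A-Za-z])([A-E])(?![A-Za-z])", model_output)
    return match.group(1) if match else None


def accuracy(outputs, answers):
    """Fraction of outputs whose extracted option equals the reference answer."""
    correct = sum(extract_choice(o) == a for o, a in zip(outputs, answers))
    return correct / len(answers)


# Made-up model outputs and reference answers, for illustration only
outputs = ["B", "答案是 C。", "The answer is A", "无法确定"]
answers = ["B", "C", "D", "A"]
print(f"accuracy = {accuracy(outputs, answers):.2%}")
```

An output with no recognizable option letter counts as wrong here, which is one possible convention; a real evaluation would also need to handle multi-answer questions.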
+## Enterprise Services
+[Apply for an API key or contact us through the WiNGPT test platform](https://wingpt.winning.com.cn/)
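+## Limitations and Disclaimer
+(a) WiNGPT2 is a large language model for the professional medical domain. It offers general users human-like AI doctor consultation and question answering, as well as Q&A on general medical knowledge. For medical professionals, WiNGPT2's answers on diagnosing a patient's condition, medication, and health advice are for reference only.
+(b) You should understand that WiNGPT2 only provides information and suggestions and cannot replace the opinions, diagnoses, or treatment recommendations of medical professionals. Before using information from WiNGPT2, please seek the advice of a doctor or other healthcare professional and independently evaluate the information provided.
+(c) Information from WiNGPT2 may contain errors or inaccuracies. Winning Health makes no express or implied warranty regarding the accuracy, reliability, completeness, quality, safety, timeliness, performance, or suitability of WiNGPT2. You bear sole responsibility for the results and decisions arising from the use of WiNGPT2. Winning Health assumes no liability for damage caused to you by third parties.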
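+## License
+1. This project is licensed under the Apache License 2.0. Use of the model weights must also comply with the license of the base model [Gemma](https://ai.google.dev/gemma?hl=zh-cn) and its [terms of use](https://ai.google.dev/gemma/terms); see the Gemma website for details.
+2. When using this project, including the model weights, please cite it: https://github.com/winninghealth/WiNGPT2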
+## Contact Us
+Website: https://www.winning.com.cn
+Email: [email protected]