Upload README.md with huggingface_hub
README.md
CHANGED
@@ -1,7 +1,7 @@
|
|
1 |
[**中文**](./README_ZH.md) | [**English**](./README.md)
|
2 |
|
3 |
<p align="center" width="100%">
|
4 |
-
<a href="https://github.com/daiyizheng/TCMChat" target="_blank"><img src="
|
5 |
</p>
|
6 |
|
7 |
# TCMChat: A Generative Large Language Model for Traditional Chinese Medicine
|
@@ -9,29 +9,30 @@
|
|
9 |
[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese/blob/main/LICENSE) [![Python 3.10.12](https://img.shields.io/badge/python-3.10.12-blue.svg)](https://www.python.org/downloads/release/python-390/)
|
10 |
|
11 |
## News
|
12 |
-
|
13 |
[2024-5-17] Open-sourced the model weights on HuggingFace.
|
14 |
|
15 |
## Application
|
16 |
|
17 |
### Install
|
18 |
-
|
19 |
-
```
|
20 |
git clone https://github.com/daiyizheng/TCMChat
|
21 |
cd TCMChat
|
22 |
```
|
23 |
Install the dependencies. Python 3.10+ is recommended.
|
24 |
|
25 |
-
```
|
26 |
pip install -r requirements.txt
|
27 |
```
|
28 |
|
29 |
### Weights download
|
30 |
-
|
31 |
- [TCMChat](https://huggingface.co/daiyizheng/TCMChat): TCM knowledge question answering and recommendation, built on Baichuan2-7B-Chat.
|
32 |
|
33 |
### Inference
|
34 |
-
|
35 |
#### Command line
|
36 |
|
37 |
```
|
@@ -50,111 +51,113 @@ We provide an online tool:[https://xomics.com.cn/tcmchat](https://xomics.com.c
|
|
50 |
|
51 |
|
52 |
### Retrain
|
53 |
-
|
54 |
#### Dataset Download
|
55 |
|
56 |
-
- [Pretrain dataset](https://
|
57 |
-
- [SFT dataset](https://
|
58 |
-
- [Benchmark dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/
|
59 |
-
|
60 |
-
> Note: Currently only sample data is provided. In the near future, we will fully open source the original data.
|
61 |
|
62 |
|
|
|
63 |
#### Pre-training
|
64 |
|
65 |
```shell
|
66 |
-
|
67 |
-
|
68 |
-
|
69 |
-
|
70 |
-
deepspeed_dir="data/resources/deepspeed_zero_stage2_config.yml"
|
71 |
-
num_train_epochs="2"
|
72 |
-
export WANDB_PROJECT="TCM-${train_type}"
|
73 |
-
date_time=$(date +"%Y%m%d%H%M%S")
|
74 |
-
run_name="${date_time}_${block_size}"
|
75 |
-
model_name_or_path="your/path/Baichuan2-7B-Chat"
|
76 |
-
output_dir="output/${train_type}/${date_time}_${block_size}"
|
77 |
-
|
78 |
-
|
79 |
-
accelerate launch --config_file ${deepspeed_dir} src/pretraining.py \
|
80 |
-
--model_name_or_path ${model_name_or_path} \
|
81 |
-
--train_file ${train_file} \
|
82 |
-
--validation_file ${validation_file} \
|
83 |
-
--preprocessing_num_workers 20 \
|
84 |
-
--cache_dir ./cache \
|
85 |
-
--block_size ${block_size} \
|
86 |
-
--seed 42 \
|
87 |
-
--do_train \
|
88 |
-
--do_eval \
|
89 |
-
--per_device_train_batch_size 32 \
|
90 |
-
--per_device_eval_batch_size 32 \
|
91 |
-
--num_train_epochs ${num_train_epochs} \
|
92 |
-
--low_cpu_mem_usage True \
|
93 |
-
--torch_dtype bfloat16 \
|
94 |
-
--bf16 \
|
95 |
-
--ddp_find_unused_parameters False \
|
96 |
-
--gradient_checkpointing True \
|
97 |
-
--learning_rate 2e-4 \
|
98 |
-
--warmup_ratio 0.05 \
|
99 |
-
--weight_decay 0.01 \
|
100 |
-
--report_to wandb \
|
101 |
-
--run_name ${run_name} \
|
102 |
-
--logging_dir logs \
|
103 |
-
--logging_strategy steps \
|
104 |
-
--logging_steps 10 \
|
105 |
-
--eval_steps 50 \
|
106 |
-
--evaluation_strategy steps \
|
107 |
-
--save_steps 100 \
|
108 |
-
--save_strategy steps \
|
109 |
-
--save_total_limit 13 \
|
110 |
-
--output_dir ${output_dir} \
|
111 |
-
--overwrite_output_dir
|
112 |
```
|
113 |
|
114 |
#### Fine-tuning
|
115 |
-
|
116 |
```shell
|
117 |
-
|
118 |
-
|
119 |
-
|
120 |
-
|
121 |
-
model_name_or_path="your/path/pretrain"
|
122 |
-
deepspeed_dir="data/resources/deepspeed_zero_stage2_confi_baichuan2.json"
|
123 |
-
export WANDB_PROJECT="TCM-${train_type}"
|
124 |
-
run_name="${train_type}_${date_time}"
|
125 |
-
output_dir="output/${train_type}/${date_time}_${model_max_length}"
|
126 |
-
|
127 |
-
|
128 |
-
deepspeed --hostfile="" src/fine-tune.py \
|
129 |
-
--report_to "wandb" \
|
130 |
-
--run_name ${run_name} \
|
131 |
-
--data_path ${data_path} \
|
132 |
-
--model_name_or_path ${model_name_or_path} \
|
133 |
-
--output_dir ${output_dir} \
|
134 |
-
--model_max_length ${model_max_length} \
|
135 |
-
--num_train_epochs 4 \
|
136 |
-
--per_device_train_batch_size 16 \
|
137 |
-
--gradient_accumulation_steps 1 \
|
138 |
-
--save_strategy epoch \
|
139 |
-
--learning_rate 2e-5 \
|
140 |
-
--lr_scheduler_type constant \
|
141 |
-
--adam_beta1 0.9 \
|
142 |
-
--adam_beta2 0.98 \
|
143 |
-
--adam_epsilon 1e-8 \
|
144 |
-
--max_grad_norm 1.0 \
|
145 |
-
--weight_decay 1e-4 \
|
146 |
-
--warmup_ratio 0.0 \
|
147 |
-
--logging_steps 1 \
|
148 |
-
--gradient_checkpointing True \
|
149 |
-
--deepspeed ${deepspeed_dir} \
|
150 |
-
--bf16 True \
|
151 |
-
--tf32 True
|
152 |
```
|
153 |
-
|
154 |
### Training details
|
155 |
|
156 |
Please refer to the experimental section of the paper for instructions.
|
157 |
|
158 |
|
159 |
|
|
|
160 |
|
|
|
|
1 |
[**中文**](./README_ZH.md) | [**English**](./README.md)
|
2 |
|
3 |
<p align="center" width="100%">
|
4 |
+
<a href="https://github.com/daiyizheng/TCMChat" target="_blank"><img src="logo.png" alt="TCMChat" style="width: 25%; min-width: 300px; display: block; margin: auto;"></a>
|
5 |
</p>
|
6 |
|
7 |
# TCMChat: A Generative Large Language Model for Traditional Chinese Medicine
|
|
|
9 |
[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese/blob/main/LICENSE) [![Python 3.10.12](https://img.shields.io/badge/python-3.10.12-blue.svg)](https://www.python.org/downloads/release/python-390/)
|
10 |
|
11 |
## News
|
12 |
+
[2024-11-1] We have fully open-sourced the model weights and training dataset on HuggingFace.
|
13 |
[2024-5-17] Open-sourced the model weights on HuggingFace.
|
14 |
|
15 |
## Application
|
16 |
|
17 |
### Install
|
18 |
+
```shell
|
|
|
19 |
git clone https://github.com/daiyizheng/TCMChat
|
20 |
cd TCMChat
|
21 |
```
|
22 |
+
Create a conda environment
|
23 |
+
```shell
|
24 |
+
conda create -n baichuan2 python=3.10 -y
|
25 |
+
```
|
26 |
Install the dependencies. Python 3.10+ is recommended.
|
27 |
|
28 |
+
```shell
|
29 |
pip install -r requirements.txt
|
30 |
```
|
31 |
|
32 |
### Weights download
|
|
|
33 |
- [TCMChat](https://huggingface.co/daiyizheng/TCMChat): TCM knowledge question answering and recommendation, built on Baichuan2-7B-Chat.
|
34 |
|
35 |
### Inference
|
|
|
36 |
#### Command line
|
37 |
|
38 |
```
|
|
|
51 |
|
52 |
|
53 |
### Retrain
|
|
|
54 |
#### Dataset Download
|
55 |
|
56 |
+
- [Pretrain dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
|
57 |
+
- [SFT dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
|
58 |
+
- [Benchmark dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/evaluation/resources)
|
|
|
|
|
59 |
|
60 |
|
61 |
+
> Note: Before running pre-training, fine-tuning, or inference, update the paths to your model, data, and other related files.
|
62 |
#### Pre-training
|
63 |
|
64 |
```shell
|
65 |
+
## Slurm cluster
|
66 |
+
sbatch scripts/pretrain/baichuan2_7b_chat.slurm
|
67 |
+
## or
|
68 |
+
bash scripts/pretrain/baichuan2_7b_chat.sh
|
69 |
```
|
70 |
|
71 |
#### Fine-tuning
|
|
|
72 |
```shell
|
73 |
+
## Slurm cluster
|
74 |
+
sbatch scripts/sft/baichuan2_7b_chat.slurm
|
75 |
+
## or
|
76 |
+
bash scripts/sft/baichuan2_7b_chat.sh
|
77 |
```
|
|
|
78 |
### Training details
|
79 |
|
80 |
Please refer to the experimental section of the paper for instructions.
|
81 |
|
82 |
|
83 |
+
### Benchmark evaluation
|
84 |
+
|
85 |
+
#### Choice Question
|
86 |
+
```shell
|
87 |
+
python evaluation/choices_evaluate/eval.py \
--model_path_or_name /your/model/path \
--model_name baichuan2-7b-chat \
--few_shot \
-sz herb \
--dev_file_path evaluation/resources/choice/single/tcm-herb_dev.csv \
--val_file_path evaluation/resources/choice/single/choice_herb_500.csv \
--log_dir logs/choices
|
88 |
+
```
|
89 |
+
|
90 |
+
#### Reading Comprehension
|
91 |
+
```shell
|
92 |
+
python infers/baichuan_infer.py \
|
93 |
+
--model_name_or_path /your/model/path \
|
94 |
+
--model_type chat \
|
95 |
+
--save_path /your/save/data/path \
|
96 |
+
--data_path /your/data/path
|
97 |
+
## BERTScore
|
98 |
+
python evaluation/question_rouge_bleu.py/question_bert_score.py
|
99 |
+
## BLEU METEOR
|
100 |
+
python evaluation/question_rouge_bleu.py/open_question_bleu.py
|
101 |
+
## ROUGE-x
|
102 |
+
python evaluation/question_rouge_bleu.py/open_question_rouge.py
|
103 |
+
|
104 |
+
```
|
105 |
+
|
106 |
+
#### Entity Extraction
|
107 |
+
```shell
|
108 |
+
python infers/baichuan_infer.py \
|
109 |
+
--model_name_or_path /your/model/path \
|
110 |
+
--model_type chat \
|
111 |
+
--save_path /your/save/data/path \
|
112 |
+
--data_path /your/data/path
|
113 |
+
|
114 |
+
python evaluation/ner_evaluate/tcm_entity_recognition.py
|
115 |
+
|
116 |
+
```
|
117 |
+
|
118 |
+
#### Medical Case Diagnosis
|
119 |
+
```shell
|
120 |
+
python infers/baichuan_infer.py \
|
121 |
+
--model_name_or_path /your/model/path \
|
122 |
+
--model_type chat \
|
123 |
+
--save_path /your/save/data/path \
|
124 |
+
--data_path /your/data/path
|
125 |
+
|
126 |
+
python evaluation/acc_evaluate/extract_syndrome.py
|
127 |
+
|
128 |
+
```
|
129 |
+
|
130 |
+
#### Herb or Formula Recommendation
|
131 |
+
```shell
|
132 |
+
python infers/baichuan_infer.py \
|
133 |
+
--model_name_or_path /your/model/path \
|
134 |
+
--model_type chat \
|
135 |
+
--save_path /your/save/data/path \
|
136 |
+
--data_path /your/data/path
|
137 |
+
|
138 |
+
python evaluation/recommend_evaluate/mrr_ndcg_p_r.py
|
139 |
+
|
140 |
+
```
|
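
The script name `mrr_ndcg_p_r.py` suggests that recommendations are scored with MRR, NDCG, precision, and recall. The sketch below shows how these ranking metrics are commonly computed for a single query with binary relevance. It is an illustration only, not the repository's implementation: the function names, the cutoff `k`, and the herb names in the example are all assumptions.

```python
import math
from typing import Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(all_ranked: Sequence[Sequence[str]],
                         all_relevant: Sequence[Set[str]]) -> float:
    """Average the reciprocal rank over all queries (MRR)."""
    if not all_ranked:
        return 0.0
    scores = [reciprocal_rank(r, g) for r, g in zip(all_ranked, all_relevant)]
    return sum(scores) / len(scores)


def precision_recall_at_k(ranked: Sequence[str], relevant: Set[str],
                          k: int) -> Tuple[float, float]:
    """Precision@k and recall@k with binary relevance."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains and a log2 position discount."""
    dcg = sum(1.0 / math.log2(pos + 1)
              for pos, item in enumerate(ranked[:k], start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    # Hypothetical model output and gold answer for one prescription query.
    predicted = ["gancao", "renshen", "huangqi", "danggui"]
    gold = {"renshen", "danggui"}
    print("RR:", reciprocal_rank(predicted, gold))
    print("P@3, R@3:", precision_recall_at_k(predicted, gold, 3))
    print("NDCG@4:", ndcg_at_k(predicted, gold, 4))
```

Per-query scores would normally be averaged over the whole benchmark file before being reported.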
141 |
+
### ADMET Prediction
|
142 |
+
#### Regression
|
143 |
+
```shell
|
144 |
+
python infers/baichuan_infer.py \
|
145 |
+
--model_name_or_path /your/model/path \
|
146 |
+
--model_type chat \
|
147 |
+
--save_path /your/save/data/path \
|
148 |
+
--data_path /your/data/path
|
149 |
+
|
150 |
+
python evaluation/admet_evaluate/rmse_mae_mse.py
|
151 |
+
|
152 |
+
```
|
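
Judging by its name, `rmse_mae_mse.py` reports RMSE, MAE, and MSE for the regression task. As a rough sketch of what those three numbers mean, the snippet below computes them from paired lists of gold and predicted values. The function name and the sample values are assumptions for illustration, and parsing numeric predictions out of the model's text output is not covered here.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, MAE, and RMSE for paired gold and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)    # mean squared error
    mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse)}


if __name__ == "__main__":
    # Hypothetical gold and predicted property values.
    print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of prediction quality.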
153 |
+
#### Classification
|
154 |
+
```shell
|
155 |
+
python infers/baichuan_infer.py \
|
156 |
+
--model_name_or_path /your/model/path \
|
157 |
+
--model_type chat \
|
158 |
+
--save_path /your/save/data/path \
|
159 |
+
--data_path /your/data/path
|
160 |
|
161 |
+
python evaluation/admet_evaluate/acc_recall_f1.py
|
162 |
|
163 |
+
```
|
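
The name `acc_recall_f1.py` suggests that the classification task is scored with accuracy, recall, and F1. The following minimal sketch computes these for binary labels. It assumes labels have already been extracted from the model output, and the function name, the `positive` parameter, and the sample labels are hypothetical rather than taken from the repository.

```python
from typing import Dict, Hashable, Sequence


def binary_classification_metrics(y_true: Sequence[Hashable],
                                  y_pred: Sequence[Hashable],
                                  positive: Hashable = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical gold and predicted labels (1 = positive class).
    print(binary_classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```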