ZJUFanLab committed on
Commit
754e45d
1 Parent(s): 1da99a6

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +99 -96
README.md CHANGED
@@ -1,7 +1,7 @@
1
  [**中文**](./README_ZH.md) | [**English**](./README.md)
2
 
3
  <p align="center" width="100%">
4
- <a href="https://github.com/daiyizheng/TCMChat" target="_blank"><img src="./logo.png" alt="TCMChat" style="width: 25%; min-width: 300px; display: block; margin: auto;"></a>
5
  </p>
6
 
7
  # TCMChat: A Generative Large Language Model for Traditional Chinese Medicine
@@ -9,29 +9,30 @@
9
  [![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese/blob/main/LICENSE) [![Python 3.10.12](https://img.shields.io/badge/python-3.10.12-blue.svg)](https://www.python.org/downloads/release/python-390/)
10
 
11
  ## News
12
-
13
  [2024-5-17] Open source model weight on HuggingFace.
14
 
15
  ## Application
16
 
17
  ### Install
18
-
19
- ```
20
  git clone https://github.com/daiyizheng/TCMChat
21
  cd TCMChat
22
  ```
 
 
 
 
23
  First install the dependencies. Python 3.10+ is recommended.
24
 
25
- ```
26
  pip install -r requirements.txt
27
  ```
28
 
29
  ### Weights download
30
-
31
  - [TCMChat](https://huggingface.co/daiyizheng/TCMChat): QA and recommendation of TCM knowledge based on baichuan2-7B-Chat.
32
 
33
  ### Inference
34
-
35
  #### Command line
36
 
37
  ```
@@ -50,111 +51,113 @@ We provide an online tool:[https://xomics.com.cn/tcmchat](https://xomics.com.c
50
 
51
 
52
  ### Retrain
53
-
54
  #### Dataset Download
55
 
56
- - [Pretrain dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/data/pretrain)
57
- - [SFT dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/data/sft)
58
- - [Benchmark dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/data/evaluate)
59
-
60
- > Note: Currently only sample data is provided. In the near future, we will fully open source the original data.
61
 
62
 
 
63
  #### Pre-training
64
 
65
  ```shell
66
- train_type="pretrain"
67
- train_file="data/pretrain/train"
68
- validation_file="data/pretrain/test"
69
- block_size="1024"
70
- deepspeed_dir="data/resources/deepspeed_zero_stage2_config.yml"
71
- num_train_epochs="2"
72
- export WANDB_PROJECT="TCM-${train_type}"
73
- date_time=$(date +"%Y%m%d%H%M%S")
74
- run_name="${date_time}_${block_size}"
75
- model_name_or_path="your/path/Baichuan2-7B-Chat"
76
- output_dir="output/${train_type}/${date_time}_${block_size}"
77
-
78
-
79
- accelerate launch --config_file ${deepspeed_dir} src/pretraining.py \
80
- --model_name_or_path ${model_name_or_path} \
81
- --train_file ${train_file} \
82
- --validation_file ${validation_file} \
83
- --preprocessing_num_workers 20 \
84
- --cache_dir ./cache \
85
- --block_size ${block_size} \
86
- --seed 42 \
87
- --do_train \
88
- --do_eval \
89
- --per_device_train_batch_size 32 \
90
- --per_device_eval_batch_size 32 \
91
- --num_train_epochs ${num_train_epochs} \
92
- --low_cpu_mem_usage True \
93
- --torch_dtype bfloat16 \
94
- --bf16 \
95
- --ddp_find_unused_parameters False \
96
- --gradient_checkpointing True \
97
- --learning_rate 2e-4 \
98
- --warmup_ratio 0.05 \
99
- --weight_decay 0.01 \
100
- --report_to wandb \
101
- --run_name ${run_name} \
102
- --logging_dir logs \
103
- --logging_strategy steps \
104
- --logging_steps 10 \
105
- --eval_steps 50 \
106
- --evaluation_strategy steps \
107
- --save_steps 100 \
108
- --save_strategy steps \
109
- --save_total_limit 13 \
110
- --output_dir ${output_dir} \
111
- --overwrite_output_dir
112
  ```
113
 
114
  #### Fine-tuning
115
-
116
  ```shell
117
- train_type="SFT"
118
- model_max_length="1024"
119
- date_time=$(date +"%Y%m%d%H%M%S")
120
- data_path="data/sft/sample_train_baichuan_data.json"
121
- model_name_or_path="your/path/pretrain"
122
- deepspeed_dir="data/resources/deepspeed_zero_stage2_confi_baichuan2.json"
123
- export WANDB_PROJECT="TCM-${train_type}"
124
- run_name="${train_type}_${date_time}"
125
- output_dir="output/${train_type}/${date_time}_${model_max_length}"
126
-
127
-
128
- deepspeed --hostfile="" src/fine-tune.py \
129
- --report_to "wandb" \
130
- --run_name ${run_name} \
131
- --data_path ${data_path} \
132
- --model_name_or_path ${model_name_or_path} \
133
- --output_dir ${output_dir} \
134
- --model_max_length ${model_max_length} \
135
- --num_train_epochs 4 \
136
- --per_device_train_batch_size 16 \
137
- --gradient_accumulation_steps 1 \
138
- --save_strategy epoch \
139
- --learning_rate 2e-5 \
140
- --lr_scheduler_type constant \
141
- --adam_beta1 0.9 \
142
- --adam_beta2 0.98 \
143
- --adam_epsilon 1e-8 \
144
- --max_grad_norm 1.0 \
145
- --weight_decay 1e-4 \
146
- --warmup_ratio 0.0 \
147
- --logging_steps 1 \
148
- --gradient_checkpointing True \
149
- --deepspeed ${deepspeed_dir} \
150
- --bf16 True \
151
- --tf32 True
152
  ```
153
-
154
  ### Training details
155
 
156
  Please refer to the experimental section of the paper for instructions.
157
 
158
 
159
 
 
160
 
 
 
1
  [**中文**](./README_ZH.md) | [**English**](./README.md)
2
 
3
  <p align="center" width="100%">
4
+ <a href="https://github.com/daiyizheng/TCMChat" target="_blank"><img src="logo.png" alt="TCMChat" style="width: 25%; min-width: 300px; display: block; margin: auto;"></a>
5
  </p>
6
 
7
  # TCMChat: A Generative Large Language Model for Traditional Chinese Medicine
 
9
  [![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese/blob/main/LICENSE) [![Python 3.10.12](https://img.shields.io/badge/python-3.10.12-blue.svg)](https://www.python.org/downloads/release/python-390/)
10
 
11
  ## News
12
+ [2024-11-1] We have fully open-sourced the model weights and training dataset on HuggingFace.
13
  [2024-5-17] Open source model weight on HuggingFace.
14
 
15
  ## Application
16
 
17
  ### Install
18
+ ```shell
 
19
  git clone https://github.com/daiyizheng/TCMChat
20
  cd TCMChat
21
  ```
22
+ Create a conda environment
23
+ ```shell
24
+ conda create -n baichuan2 python=3.10 -y
25
+ ```
26
  First install the dependencies. Python 3.10+ is recommended.
27
 
28
+ ```shell
29
  pip install -r requirements.txt
30
  ```
31
 
32
  ### Weights download
 
33
  - [TCMChat](https://huggingface.co/daiyizheng/TCMChat): QA and recommendation of TCM knowledge based on baichuan2-7B-Chat.
34
 
35
  ### Inference
 
36
  #### Command line
37
 
38
  ```
 
51
 
52
 
53
  ### Retrain
 
54
  #### Dataset Download
55
 
56
+ - [Pretrain dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
57
+ - [SFT dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
58
+ - [Benchmark dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/evaluation/resources)
 
 
59
 
60
 
61
+ > Note: Before performing pre-training, fine-tuning, and inference, please modify the relevant paths for your model, data, and other related files.
62
  #### Pre-training
63
 
64
  ```shell
65
+ ## Slurm cluster
66
+ sbatch scripts/pretrain/baichuan2_7b_chat.slurm
67
+ ## or
68
+ bash scripts/pretrain/baichuan2_7b_chat.sh
69
  ```
70
 
71
  #### Fine-tuning
 
72
  ```shell
73
+ ## Slurm cluster
74
+ sbatch scripts/sft/baichuan2_7b_chat.slurm
75
+ ## or
76
+ bash scripts/sft/baichuan2_7b_chat.sh
77
  ```
 
78
  ### Training details
79
 
80
  Please refer to the experimental section of the paper for instructions.
81
 
82
 
83
+ ### Benchmark evaluation
84
+
85
+ #### Choice Question
86
+ ```shell
87
+ python evaluation/choices_evaluate/eval.py --model_path_or_name /your/model/path --model_name baichuan2-7b-chat --few_shot -sz herb --dev_file_path evaluation/resources/choice/single/tcm-herb_dev.csv --val_file_path evaluation/resources/choice/single/choice_herb_500.csv --log_dir logs/choices
88
+ ```
89
+
90
+ #### Reading Comprehension
91
+ ```shell
92
+ python infers/baichuan_infer.py \
93
+ --model_name_or_path /your/model/path \
94
+ --model_type chat \
95
+ --save_path /your/save/data/path \
96
+ --data_path /your/data/path
97
+ ## BERTScore
98
+ python evaluation/question_rouge_bleu.py/question_bert_score.py
99
+ ## BLEU METEOR
100
+ python evaluation/question_rouge_bleu.py/open_question_bleu.py
101
+ ## ROUGE-x
102
+ python evaluation/question_rouge_bleu.py/open_question_rouge.py
103
+
104
+ ```
105
+
106
+ #### Entity Extraction
107
+ ```shell
108
+ python infers/baichuan_infer.py \
109
+ --model_name_or_path /your/model/path \
110
+ --model_type chat \
111
+ --save_path /your/save/data/path \
112
+ --data_path /your/data/path
113
+
114
+ python evaluation/ner_evaluate/tcm_entity_recognition.py
115
+
116
+ ```
117
+
118
+ #### Medical Case Diagnosis
119
+ ```shell
120
+ python infers/baichuan_infer.py \
121
+ --model_name_or_path /your/model/path \
122
+ --model_type chat \
123
+ --save_path /your/save/data/path \
124
+ --data_path /your/data/path
125
+
126
+ python evaluation/acc_evaluate/extract_syndrome.py
127
+
128
+ ```
129
+
130
+ #### Herb or Formula Recommendation
131
+ ```shell
132
+ python infers/baichuan_infer.py \
133
+ --model_name_or_path /your/model/path \
134
+ --model_type chat \
135
+ --save_path /your/save/data/path \
136
+ --data_path /your/data/path
137
+
138
+ python evaluation/recommend_evaluate/mrr_ndcg_p_r.py
139
+
140
+ ```
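
The script name `mrr_ndcg_p_r.py` suggests that it reports MRR, NDCG, precision, and recall, but its implementation is not shown in this diff. As a rough illustration of what these ranking metrics measure, here is a minimal sketch for a single ranked list with binary relevance. The function names and the herb names are hypothetical and are not taken from the repository.

```python
import math


def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def precision_recall_at_k(ranked, relevant, k):
    """Return (precision@k, recall@k) for one ranked list."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the list divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 1)
        for pos, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical data: herbs ranked by the model vs. the reference prescription.
predicted = ["gancao", "renshen", "huangqi", "baizhu"]
reference = {"renshen", "baizhu"}

rr = reciprocal_rank(predicted, reference)
precision, recall = precision_recall_at_k(predicted, reference, k=4)
ndcg = ndcg_at_k(predicted, reference, k=4)
print(rr, precision, recall, ndcg)
```

MRR is the mean of `reciprocal_rank` over all evaluation samples; the other metrics are usually averaged over samples in the same way.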
141
+ ### ADMET Prediction
142
+ #### Regression
143
+ ```shell
144
+ python infers/baichuan_infer.py \
145
+ --model_name_or_path /your/model/path \
146
+ --model_type chat \
147
+ --save_path /your/save/data/path \
148
+ --data_path /your/data/path
149
+
150
+ python evaluation/admet_evaluate/rmse_mae_mse.py
151
+
152
+ ```
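
For reference, the three error measures named in `rmse_mae_mse.py` can be sketched as follows. This is an illustrative, dependency-free version that assumes predictions have already been parsed into floats; the function name and sample values are hypothetical, and the repository script may parse model outputs and aggregate them differently.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MSE, RMSE, and MAE for two equally long lists of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [pred - true for true, pred in zip(y_true, y_pred)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical ADMET regression targets and model predictions.
targets = [1.0, 2.0, 3.0]
outputs = [1.0, 2.5, 2.0]
errors = regression_errors(targets, outputs)
print(errors)
```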
153
+ #### Classification
154
+ ```shell
155
+ python infers/baichuan_infer.py \
156
+ --model_name_or_path /your/model/path \
157
+ --model_type chat \
158
+ --save_path /your/save/data/path \
159
+ --data_path /your/data/path
160
 
161
+ python evaluation/admet_evaluate/acc_recall_f1.py
162
 
163
+ ```
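
The script name `acc_recall_f1.py` points to accuracy, recall, and F1. A minimal sketch of these scores for binary labels is given below. It assumes the model outputs have already been mapped to 0/1 labels; the function name and sample labels are hypothetical, and the actual script may use a different averaging scheme or a library such as scikit-learn.

```python
def classification_scores(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ADMET classification labels and model predictions.
labels = [1, 0, 1, 1, 0]
preds = [1, 0, 0, 1, 1]
scores = classification_scores(labels, preds)
print(scores)
```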