---
language: en
tags:
- TCM
- chinese-medicine
- conversational
license: apache-2.0
datasets:
- ZJUFanLab/TCMChat-dataset-600k
model-index:
- name: TCMChat-600k
  results: []
---

[**中文**](./README_ZH.md) | [**English**](./README.md)

<p align="center" width="100%">
<a href="https://github.com/daiyizheng/TCMChat" target="_blank"><img src="logo.png" alt="TCMChat" style="width: 25%; min-width: 300px; display: block; margin: auto;"></a>
</p>

# TCMChat: A Generative Large Language Model for Traditional Chinese Medicine

[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese/blob/main/LICENSE) [![Python 3.10.12](https://img.shields.io/badge/python-3.10.12-blue.svg)](https://www.python.org/downloads/release/python-31012/)

## News
[2024-11-1] We have fully open-sourced the model weights and training dataset on Hugging Face.  
[2024-5-17] We released the model weights on Hugging Face.

## Application

### Install
```shell
git clone https://github.com/daiyizheng/TCMChat
cd TCMChat
```
Create and activate a conda environment:
```shell
conda create -n baichuan2 python=3.10 -y
conda activate baichuan2
```
Then install the dependencies. Python 3.10+ is recommended.

```shell
pip install -r requirements.txt
```

### Weights download
- [TCMChat](https://huggingface.co/daiyizheng/TCMChat): QA and recommendation of TCM knowledge based on baichuan2-7B-Chat.

### Inference
#### Command line

```shell
python cli_infer.py \
--model_name_or_path /your/model/path \
--model_type chat
```

#### Web demo

```shell
python gradio_demo.py
```

We provide an online tool: [https://xomics.com.cn/tcmchat](https://xomics.com.cn/tcmchat)


### Retrain
#### Dataset Download

- [Pretrain dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k) 
- [SFT dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
- [Benchmark dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/evaluation/resources)


> Note: Before performing pre-training, fine-tuning, and inference, please modify the relevant paths for your model, data, and other related files.
#### Pre-training

```shell
## Slurm cluster
sbatch scripts/pretrain/baichuan2_7b_chat.slurm
## or
bash scripts/pretrain/baichuan2_7b_chat.sh
```

#### Fine-tuning
```shell
## Slurm cluster
sbatch scripts/sft/baichuan2_7b_chat.slurm
## or
bash scripts/sft/baichuan2_7b_chat.sh
```
### Training details

Please refer to the experimental section of the paper for the training details.


### Benchmark evaluation

#### Choice Question
```shell
python evaluation/choices_evaluate/eval.py \
--model_path_or_name /your/model/path \
--model_name baichuan2-7b-chat \
--few_shot \
-sz herb \
--dev_file_path evaluation/resources/choice/single/tcm-herb_dev.csv \
--val_file_path evaluation/resources/choice/single/choice_herb_500.csv \
--log_dir logs/choices
```
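
For orientation, the snippet below sketches how a choice-question score is typically computed: extract an option letter from each model answer and compare it with the reference. It is a minimal illustration, not the logic of `evaluation/choices_evaluate/eval.py`; the function names `extract_choice` and `choice_accuracy`, the option range A-E, and the sample answers are all assumptions made for this example.

```python
import re
from typing import List, Optional


def extract_choice(answer: str) -> Optional[str]:
    """Return the first standalone option letter (A-E) in a model answer.

    Assumption: options are labelled A-E. A letter counts as standalone when
    it is not adjacent to another Latin letter, so "B" in "答案是B。" matches
    but the "A" inside "Answer" does not.
    """
    match = re.search(r"(?<![A-Za-z])([A-E])(?![A-Za-z])", answer)
    return match.group(1) if match else None


def choice_accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of answers whose extracted letter equals the reference letter."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(extract_choice(p) == r for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Hypothetical model outputs and gold labels, for illustration only.
    preds = ["The answer is B", "答案是C。", "I am not sure"]
    golds = ["B", "C", "A"]
    print(f"accuracy = {choice_accuracy(preds, golds):.3f}")
```

An answer with no recognizable option letter is counted as wrong, which is a common but not universal convention.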

#### Reading Comprehension
```shell
python infers/baichuan_infer.py \
--model_name_or_path /your/model/path \
--model_type chat \
--save_path /your/save/data/path \
--data_path /your/data/path
## BERTScore
python evaluation/question_rouge_bleu.py/question_bert_score.py
## BLEU and METEOR
python evaluation/question_rouge_bleu.py/open_question_bleu.py
## ROUGE-x
python evaluation/question_rouge_bleu.py/open_question_rouge.py
```

#### Entity Extraction
```shell
python infers/baichuan_infer.py \
--model_name_or_path /your/model/path \
--model_type chat \
--save_path /your/save/data/path \
--data_path /your/data/path

python evaluation/ner_evaluate/tcm_entity_recognition.py

```

#### Medical Case Diagnosis
```shell
python infers/baichuan_infer.py \
--model_name_or_path /your/model/path \
--model_type chat \
--save_path /your/save/data/path \
--data_path /your/data/path

python evaluation/acc_evaluate/extract_syndrome.py

```

#### Herb or Formula Recommendation
```shell
python infers/baichuan_infer.py \
--model_name_or_path /your/model/path \
--model_type chat \
--save_path /your/save/data/path \
--data_path /your/data/path

python evaluation/recommend_evaluate/mrr_ndcg_p_r.py

```
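
The script name suggests that recommendation quality is reported as MRR, NDCG, precision, and recall. The sketch below shows the standard definitions of these metrics for a single ranked list with binary relevance. It is an assumption-based illustration, not the code in `evaluation/recommend_evaluate/mrr_ndcg_p_r.py`; the function names and the herb names in the example are hypothetical.

```python
import math
from typing import List, Set, Tuple


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def precision_recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Precision@k and recall@k with binary relevance."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains and a log2 position discount."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    # Hypothetical model ranking and reference prescription.
    ranked = ["gancao", "huangqi", "renshen", "danggui"]
    relevant = {"huangqi", "danggui"}
    p, r = precision_recall_at_k(ranked, relevant, k=3)
    print(f"RR={reciprocal_rank(ranked, relevant):.3f} P@3={p:.3f} R@3={r:.3f} "
          f"NDCG@4={ndcg_at_k(ranked, relevant, 4):.3f}")
```

MRR over a test set is the mean of `reciprocal_rank` across all queries; the other metrics are averaged the same way.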
### ADMET Prediction
#### Regression
```shell
python infers/baichuan_infer.py \
--model_name_or_path /your/model/path \
--model_type chat \
--save_path /your/save/data/path \
--data_path /your/data/path

python evaluation/admet_evaluate/rmse_mae_mse.py

```
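
As a reference for reading the regression results, here is a minimal sketch of the three error metrics named in the script file (RMSE, MAE, MSE), using their standard definitions. It does not reproduce `evaluation/admet_evaluate/rmse_mae_mse.py`; the function name `regression_errors` and the sample values are assumptions for illustration.

```python
import math
from typing import Dict, List


def regression_errors(y_true: List[float], y_pred: List[float]) -> Dict[str, float]:
    """Return MSE, MAE, and RMSE for paired true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [pred - true for true, pred in zip(y_true, y_pred)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse)}


if __name__ == "__main__":
    # Hypothetical property values parsed from model outputs.
    print(regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In practice the predicted numbers first have to be parsed out of the generated text; how unparseable outputs are handled is a design choice this sketch leaves open.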
#### Classification
```shell
python infers/baichuan_infer.py \
--model_name_or_path /your/model/path \
--model_type chat \
--save_path /your/save/data/path \
--data_path /your/data/path

python evaluation/admet_evaluate/acc_recall_f1.py

```
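
For the classification tasks, the script name points to accuracy, recall, and F1. The following sketch computes them for binary labels using the standard definitions. It is not taken from `evaluation/admet_evaluate/acc_recall_f1.py`; the function name, the choice of `1` as the positive label, and the sample labels are assumptions.

```python
from typing import Dict, List


def binary_classification_metrics(
    y_true: List[int], y_pred: List[int], positive: int = 1
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical gold labels and model predictions.
    print(binary_classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```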