	[中文](./README_ZH.md) \| [English](./README.md)

	<p align="center" width="100%">
	<a href="https://github.com/daiyizheng/TCMChat" target="_blank"><img src="logo.png" alt="TCMChat" style="width: 25%; min-width: 300px; display: block; margin: auto;"></a>
	</p>

	# TCMChat: A Generative Large Language Model for Traditional Chinese Medicine

	[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese/blob/main/LICENSE) [![Python 3.10.12](https://img.shields.io/badge/python-3.10.12-blue.svg)](https://www.python.org/downloads/release/python-31012/)

	## News
	- [2024-11-1] The model weights and training dataset are now fully open-sourced on Hugging Face.
	- [2024-5-17] The model weights were released on Hugging Face.

	## Application

	### Install
	```shell
	git clone https://github.com/daiyizheng/TCMChat
	cd TCMChat
	```
	Create and activate a conda environment:
	```shell
	conda create -n baichuan2 python=3.10 -y
	conda activate baichuan2
	```
	Then install the dependencies. Python 3.10 or later is recommended.

	```shell
	pip install -r requirements.txt
	```

	### Weights download
	- [TCMChat](https://huggingface.co/daiyizheng/TCMChat): a model for TCM question answering and recommendation, based on Baichuan2-7B-Chat.
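
As an alternative to downloading through the browser, the weights can be fetched from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names and the default `./TCMChat` directory are illustrative and not part of this repository.

```python
# Repository id taken from the link above.
MODEL_REPO = "daiyizheng/TCMChat"


def build_download_args(repo_id: str, local_dir: str) -> dict:
    """Collect the keyword arguments passed to snapshot_download (hypothetical helper)."""
    return {"repo_id": repo_id, "local_dir": local_dir}


def download_weights(local_dir: str = "./TCMChat") -> str:
    """Download all files of the model repository and return the local path."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_args(MODEL_REPO, local_dir))
```

The path returned by `download_weights()` can then be passed as `--model_name_or_path` to the inference scripts.
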

	### Inference
	#### Command line

	```shell
	python cli_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat
	```

	#### Web demo

	```shell
	python gradio_demo.py
	```

	An online demo is also available: [https://xomics.com.cn/tcmchat](https://xomics.com.cn/tcmchat)


	### Retrain
	#### Dataset Download

	- [Pretrain dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
	- [SFT dataset](https://huggingface.co/datasets/ZJUFanLab/TCMChat-dataset-600k)
	- [Benchmark dataset](https://github.com/ZJUFanLab/TCMChat/tree/master/evaluation/resources)


	> Note: Before running pre-training, fine-tuning, or inference, update the paths to your model, data, and other related files in the scripts.

	#### Pre-training

	```shell
	## Slurm cluster
	sbatch scripts/pretrain/baichuan2_7b_chat.slurm
	## or
	bash scripts/pretrain/baichuan2_7b_chat.sh
	```

	#### Fine-tuning
	```shell
	## Slurm cluster
	sbatch scripts/sft/baichuan2_7b_chat.slurm
	## or
	bash scripts/sft/baichuan2_7b_chat.sh
	```

	### Training details

	For training details, please refer to the experimental section of the paper.


	### Benchmark evaluation

	#### Choice Question
	```shell
	python evaluation/choices_evaluate/eval.py \
	--model_path_or_name /your/model/path \
	--model_name baichuan2-7b-chat \
	--few_shot \
	-sz herb \
	--dev_file_path evaluation/resources/choice/single/tcm-herb_dev.csv \
	--val_file_path evaluation/resources/choice/single/choice_herb_500.csv \
	--log_dir logs/choices
	```

	#### Reading Comprehension
	```shell
	## Generate model answers
	python infers/baichuan_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat \
	--save_path /your/save/data/path \
	--data_path /your/data/path
	## BERTScore
	python evaluation/question_rouge_bleu.py/question_bert_score.py
	## BLEU, METEOR
	python evaluation/question_rouge_bleu.py/open_question_bleu.py
	## ROUGE-x
	python evaluation/question_rouge_bleu.py/open_question_rouge.py
	```
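
For readers unfamiliar with the overlap metrics named above, here is a minimal sketch of ROUGE-N. It is not the repository's evaluation script: it assumes character-level tokens (a common simplification for Chinese text) and a single reference per answer, and the function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """ROUGE-N precision, recall and F1 over character n-grams."""
    cand = ngrams(list(candidate), n)
    ref = ngrams(list(reference), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, `rouge_n("人参补气", "人参益气")` shares three of four characters with the reference, so precision and recall are both 0.75.
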

	#### Entity Extraction
	```shell
	## Generate model predictions
	python infers/baichuan_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat \
	--save_path /your/save/data/path \
	--data_path /your/data/path
	## Score the predictions
	python evaluation/ner_evaluate/tcm_entity_recognition.py
	```

	#### Medical Case Diagnosis
	```shell
	## Generate model predictions
	python infers/baichuan_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat \
	--save_path /your/save/data/path \
	--data_path /your/data/path
	## Score the predictions
	python evaluation/acc_evaluate/extract_syndrome.py
	```

	#### Herb or Formula Recommendation
	```shell
	## Generate model predictions
	python infers/baichuan_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat \
	--save_path /your/save/data/path \
	--data_path /your/data/path
	## Score the predictions
	python evaluation/recommend_evaluate/mrr_ndcg_p_r.py
	```
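
The script name suggests that it reports MRR, NDCG, precision, and recall. The sketch below shows how these ranking metrics are commonly defined, assuming binary relevance and a ranked list without duplicates. It illustrates the metrics only; the repository's implementation may differ, and all function names are hypothetical.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (ranked, relevant) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def precision_recall_at_k(ranked, relevant, k):
    """Precision@k divides hits by k; recall@k divides hits by the number of relevant items."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    recall = hits / len(relevant) if relevant else 0.0
    return hits / k, recall


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

With the ranking `["甘草", "人参", "黄芪", "当归"]` and relevant herbs `{"人参", "当归"}`, the first hit is at rank 2, so the reciprocal rank is 0.5, and precision@2 and recall@2 are both 0.5.
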
	### ADMET Prediction
	#### Regression
	```shell
	## Generate model predictions
	python infers/baichuan_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat \
	--save_path /your/save/data/path \
	--data_path /your/data/path
	## Score the predictions
	python evaluation/admet_evaluate/rmse_mae_mse.py
	```
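
The regression script is named after RMSE, MAE, and MSE. As a reference for how these errors are defined, here is a minimal sketch in plain Python. It assumes the predictions have already been parsed into numbers, and the function name is illustrative rather than taken from the repository.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and RMSE for paired true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    # RMSE is the square root of MSE, so it is in the same unit as the target.
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse)}
```

For true values `[1, 2, 3, 4]` and predictions `[2, 2, 3, 6]`, the errors are 1, 0, 0, and 2, which gives an MSE of 1.25 and an MAE of 0.75.
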
	#### Classification
	```shell
	## Generate model predictions
	python infers/baichuan_infer.py \
	--model_name_or_path /your/model/path \
	--model_type chat \
	--save_path /your/save/data/path \
	--data_path /your/data/path
	## Score the predictions
	python evaluation/admet_evaluate/acc_recall_f1.py
	```
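
The classification script is named after accuracy, recall, and F1. The following sketch shows the usual binary definitions, assuming labels have already been extracted from the model output and that `1` marks the positive class. It is an illustration only, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For labels `[1, 0, 1, 1, 0]` and predictions `[1, 0, 0, 1, 1]`, three of five predictions are correct (accuracy 0.6), and two of the three positives are recovered (recall 2/3).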