TCMChat: A generative large language model for traditional Chinese medicine

Cited: 0
Authors
Dai, Yizheng [1,2]
Shao, Xin [1,2,3]
Zhang, Jinlu [1,2]
Chen, Yulong [2]
Chen, Qian [1,2,3]
Liao, Jie [1,2]
Chi, Fei [2]
Zhang, Junhua [4]
Fan, Xiaohui [1,2,3,5]
Affiliations
[1] Zhejiang Univ, Pharmaceut Informat Inst, Coll Pharmaceut Sci, Hangzhou 310058, Peoples R China
[2] Zhejiang Univ, Innovat Ctr Yangtze River Delta, State Key Lab Chinese Med Modernizat, Jiaxing 314103, Peoples R China
[3] Ningbo Municipal Hosp TCM, Joint Lab Clin Multiomics Res Zhejiang Univ & Ning, Ningbo 315000, Peoples R China
[4] Tianjin Univ Tradit Chinese Med, State Key Lab Chinese Med Modernizat, Tianjin 301617, Peoples R China
[5] Zhejiang Univ, Womens Hosp, Sch Med, Zhejiang Key Lab Precis Diag & Therapy Major Gynec, Hangzhou 310006, Peoples R China
Keywords
Traditional Chinese medicine; Large language model; Dialogue system; Pre-training; Supervised fine-tuning;
DOI
10.1016/j.phrs.2024.107530
Chinese Library Classification
R9 [Pharmacy];
Discipline Code
1007;
Abstract
The use of ground-breaking large language models (LLMs) coupled with dialogue systems has become increasingly prevalent in the medical domain. Nevertheless, the expertise of LLMs in Traditional Chinese Medicine (TCM) remains limited, despite several recently proposed TCM LLMs. Here, we introduce TCMChat (https://xomics.com.cn/tcmchat), a generative LLM built through pre-training (PT) and supervised fine-tuning (SFT) on large-scale curated TCM text knowledge and Chinese question-answering (QA) datasets. Specifically, we first compiled a customized training set covering six Chinese-medicine scenarios through text mining and manual verification: TCM knowledge base, multiple-choice questions, reading comprehension, entity extraction, medical case diagnosis, and herb or formula recommendation. We then performed PT and SFT, using Baichuan2-7B-Chat as the foundation model. Benchmarking datasets and case studies further demonstrate the superior performance of TCMChat compared with existing models. Our code, data, and model are publicly released on GitHub (https://github.com/ZJUFanLab/TCMChat) and HuggingFace (https://huggingface.co/ZJUFanLab), providing a high-quality knowledge base for TCM modernization research along with a user-friendly dialogue web tool.
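The six-scenario SFT corpus described in the abstract can be sketched as single-turn chat records in JSON Lines form. The field names and scenario tags below are illustrative assumptions for the sketch, not the authors' actual schema:

```python
import json

# Hypothetical scenario tags mirroring the six training scenarios named in
# the abstract; the labels actually used by TCMChat may differ.
SCENARIOS = {
    "knowledge", "choice", "reading", "entity", "diagnosis", "recommendation",
}

def to_sft_record(scenario: str, question: str, answer: str) -> dict:
    """Package one curated QA pair as a single-turn instruction-tuning record."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario}")
    return {
        "scenario": scenario,
        "conversations": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ],
    }

# Example: one herb-recommendation pair serialized as a JSON line.
record = to_sft_record(
    "recommendation",
    "Which herbs are commonly paired with ginseng to tonify qi?",
    "Astragalus (Huang Qi) is a classic qi-tonifying partner of ginseng.",
)
print(json.dumps(record, ensure_ascii=False))
```

A chat-style record like this maps directly onto the conversation template of a chat foundation model such as Baichuan2-7B-Chat during SFT.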
Pages: 15