Contrastive learning with large language models for medical code prediction

Cited by: 0
Authors
Wu, Yuzhou [1 ]
Zhang, Jin [1 ]
Chen, Xuechen [2 ]
Yao, Xin [2 ]
Chen, Zhigang [1 ]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp Sci & Engn, Changsha 410114, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410012, Peoples R China
Keywords
International classification of diseases; Automatic ICD coding; Code hierarchy; Contrastive learning; Imbalanced label distribution;
DOI
10.1016/j.eswa.2025.127241
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Medical code prediction is an important technique for analyzing unstructured electronic health records (EHRs). It assigns each record the corresponding codes from the International Classification of Diseases (ICD), a disease classification scheme introduced by the World Health Organization (WHO). The main challenges of automatic medical code prediction are long clinical texts and a long-tailed label distribution. Prior studies have applied pre-trained language models (PLMs) to medical code prediction. However, these methods either do not address the input text length limit or do not provide additional mechanisms for learning better code representations under a long-tailed label distribution. In this work, we propose a novel contrastive learning framework with a large language model (CL-LLM) for medical code prediction. CL-LLM exploits the large language model (LLM) to address the long-tailed label distribution by designing a prompt method that generates synonymous code descriptions. In addition, CL-LLM uses contrastive learning to inject these synonyms into the code description encoder, enhancing the model's few-shot label prediction. To handle long input text, we employ the pre-trained language model ClinicalBERT as the clinical text encoder and apply split pooling to segment long inputs. We conducted experiments on the public MIMIC-III and MIMIC-IV datasets; the results show that our model outperforms previous state-of-the-art methods for automatic ICD coding. Ablation experiments confirm the contribution of each component of CL-LLM. To further verify performance on rare labels, we evaluated our method on the MIMIC-III RARE50 dataset and achieved superior results.
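The two mechanisms named in the abstract can be illustrated concretely. Below is a minimal PyTorch sketch, not the authors' released code: the checkpoint name, the function names (split_pool_encode, synonym_contrastive_loss), the chunk length, and the temperature are all assumptions. It shows (1) split pooling, i.e. segmenting a long clinical note into BERT-sized chunks, encoding each chunk with ClinicalBERT, and averaging the chunk embeddings, and (2) an InfoNCE-style contrastive loss that pulls each ICD code description toward its LLM-generated synonym while pushing it away from the other codes in the batch.

```python
# Minimal sketch (not the paper's implementation). Assumed: the public
# Bio_ClinicalBERT checkpoint, 512-token chunks, and an InfoNCE-style
# loss; all function names here are hypothetical.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def split_pool_encode(text: str, chunk_len: int = 512) -> torch.Tensor:
    """Split pooling: segment a long note into chunk_len-token pieces,
    encode each piece with ClinicalBERT, and average the chunk vectors."""
    ids = tokenizer(text, truncation=False, return_tensors="pt")["input_ids"][0]
    chunk_vecs = []
    with torch.no_grad():
        for chunk in ids.split(chunk_len):           # segment the token stream
            out = encoder(input_ids=chunk.unsqueeze(0))
            # mean-pool the token embeddings within the chunk
            chunk_vecs.append(out.last_hidden_state.mean(dim=1))
    return torch.cat(chunk_vecs).mean(dim=0)         # pool across chunks

def synonym_contrastive_loss(desc_emb: torch.Tensor,
                             syn_emb: torch.Tensor,
                             temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss: row i of desc_emb (an ICD code description)
    should match row i of syn_emb (its LLM-generated synonym); all other
    rows in the batch act as negatives."""
    desc = F.normalize(desc_emb, dim=-1)
    syn = F.normalize(syn_emb, dim=-1)
    logits = desc @ syn.T / temperature              # cosine-similarity logits
    targets = torch.arange(desc.size(0))             # positives on the diagonal
    return F.cross_entropy(logits, targets)
```

In the framework the abstract describes, the synonym embeddings would come from encoding code descriptions paraphrased by an LLM prompt; the sketch above leaves that generation step out and only shows how such pairs could train the code description encoder.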
Pages: 9