STAF-LLM: A scalable and task-adaptive fine-tuning framework for large language models in medical domain

Cited by: 0
Authors
Xu, Tianhan [1 ,2 ]
Chen, Ling [1 ,2 ]
Hu, Zhe [3 ]
Li, Bin [1 ,2 ]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou 225127, Jiangsu, Peoples R China
[2] Jiangsu Prov Engn Res Ctr Knowledge Management & I, Yangzhou 225127, Jiangsu, Peoples R China
[3] Hong Kong Polytech Univ, Dept Comp, Hong Kong 999077, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Large language models; Task-adaptive fine-tuning; Knowledge transfer; Scalable; Medical applications;
DOI
10.1016/j.eswa.2025.127582
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent large language models (LLMs) have demonstrated remarkable performance across various NLP tasks. However, their application in the medical domain is often limited by a lack of specialized medical knowledge, which is crucial for practical clinical tasks. In this work, we propose STAF-LLM, a scalable and task-adaptive fine-tuning framework designed to customize general-purpose LLMs for diverse downstream medical applications. STAF-LLM consists of two stages: expert model training and task adaptation. In the first stage, we design 12 core medical tasks and use AdaLoRA to train 12 expert models on these tasks with a unified instruction format, transferring the learned domain-specific knowledge to the general-purpose LLM. In the second stage, a task-guided router is trained for each downstream application to adaptively combine the expert knowledge with the LLM, dynamically selecting the most relevant knowledge for inference. Experiments on 9 medical tasks, including 3 unseen ones, show that STAF-LLM outperforms Llama 2 by 10%-30%. Notably, STAF-LLM achieves state-of-the-art performance on benchmark tasks like ICD coding.
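The second stage described above hinges on a task-guided router that weights each expert's knowledge before it is merged with the base LLM. The snippet below is a minimal sketch of that routing idea only; the class name, tensor shapes, and the additive way the expert mixture is injected into the base hidden state are illustrative assumptions, not the paper's actual interface.

```python
# Minimal sketch (PyTorch) of a task-guided router that mixes the outputs of
# several frozen expert adapters with the hidden state of a frozen base LLM.
# All names and shapes are assumptions for illustration, not the paper's API.
import torch
import torch.nn as nn


class TaskGuidedRouter(nn.Module):
    """Learns token-wise weights over `num_experts` expert outputs."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts)

    def forward(self, base_hidden: torch.Tensor, expert_hiddens: torch.Tensor) -> torch.Tensor:
        # base_hidden:    (batch, seq, hidden)               frozen base-LLM hidden states
        # expert_hiddens: (num_experts, batch, seq, hidden)  outputs of the expert adapters
        weights = torch.softmax(self.gate(base_hidden), dim=-1)          # (batch, seq, num_experts)
        mixed = torch.einsum("bse,ebsh->bsh", weights, expert_hiddens)   # weighted sum over experts
        return base_hidden + mixed  # inject the selected expert knowledge into the base stream


# Usage with dummy tensors: 12 experts, matching the paper's first stage.
router = TaskGuidedRouter(hidden_size=4096, num_experts=12)
base = torch.randn(2, 16, 4096)
experts = torch.randn(12, 2, 16, 4096)
print(router(base, experts).shape)  # torch.Size([2, 16, 4096])
```

Only the router's gate is trained per downstream task here, which is consistent with the abstract's claim that task adaptation reuses the frozen expert models rather than retraining them.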
Pages: 13