Efficient Inference Offloading for Mixture-of-Experts Large Language Models in Internet of Medical Things

Cited by: 1
Authors
Yuan, Xiaoming [1, 2]
Kong, Weixuan [1 ]
Luo, Zhenyu [1 ]
Xu, Minrui [3 ]
Affiliations
[1] Northeastern Univ Qinhuangdao, Hebei Key Lab Marine Percept Network & Data Proc, Qinhuangdao 066004, Peoples R China
[2] Xidian Univ, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Funding
National Natural Science Foundation of China
Keywords
large language models; efficient inference offloading; mixture-of-experts; Internet of Medical Things
DOI
10.3390/electronics13112077
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Despite recent significant advances in large language models (LLMs) for medical services, the difficulty of deploying LLMs in e-healthcare hinders complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy protection. Existing LLMs struggle to provide accurate medical question answering (Q&A) and to meet the resource demands of deployment in the IoMT. To address these challenges, we propose MedMixtral 8x7B, a new medical LLM based on the mixture-of-experts (MoE) architecture with an offloading strategy, which enables deployment on the IoMT and improves privacy protection for users. Additionally, we find that the significant factors affecting latency include the method of device interconnection, the location of the offloading servers, and the disk speed.
Pages: 17