Fine-tuning large language models for improved health communication in low-resource languages

Cited: 0
Authors
Bui, Nhat [1 ]
Nguyen, Giang [1 ]
Nguyen, Nguyen [1 ]
Vo, Bao [1 ]
Vo, Luan [1 ]
Huynh, Tom [1 ]
Tang, Arthur [1 ]
Tran, Van Nhiem [2 ]
Huynh, Tuyen [3 ]
Nguyen, Huy Quang [3 ]
Dinh, Minh [1 ]
Affiliations
[1] RMIT Univ, Sch Sci Engn & Technol, Ho Chi Minh City, Vietnam
[2] Hon Hai Res Inst, AI Res Ctr, Taipei 114699, Taiwan
[3] Oxford Univ Clin Res Unit OUCRU, Ho Chi Minh City, Vietnam
Keywords
Artificial intelligence; Large language model; Low-resource languages; Health communication and promotion; Data privacy and security; Health equity
DOI
10.1016/j.cmpb.2025.108655
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Background: This study presents a methodology for compiling training datasets to fine-tune Large Language Models (LLMs) for healthcare information in Vietnamese, a low-resource language. The objective is to bridge the gap in medical information accessibility and to improve healthcare communication in developing countries by adapting LLMs to specific linguistic nuances and domain needs.

Method: The methodology involves selecting a base model, compiling a domain-specific dataset, and fine-tuning the model on that dataset. Three open-source models were selected. The dataset, comprising approximately 337,000 Vietnamese prompt-response pairs, was compiled from existing datasets, data crawled from Vietnamese medical online forums, and content distilled from Vietnamese medical textbooks. The three models were fine-tuned using the Low-Rank Adaptation (LoRA) and Quantized Low-Rank Adaptation (QLoRA) techniques. Model performance was evaluated using BERTScore, ROUGE-L, and the "LLM-as-a-Judge" method.

Results: The fine-tuned models outperformed their base versions on all three measures (BERTScore, ROUGE-L, and "LLM-as-a-Judge"), confirming the effectiveness of the fine-tuning process. The study details the process of fine-tuning open-source LLMs for health information inquiries in Vietnamese and demonstrates its potential to improve healthcare communication in low-resource languages. Deploying the fine-tuned LLM on premises enhances data privacy and security; however, the substantial computing power and costs required pose challenges, especially for organizations in developing countries.

Conclusion: This case study highlights the unique challenges faced by developing countries that use low-resource languages. Initiatives are needed to bridge healthcare gaps in underserved areas and to contribute to global health equity.
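The fine-tuning step named in the abstract (QLoRA on an open-source base model over Vietnamese prompt-response pairs) can be sketched with the Hugging Face transformers, peft, and bitsandbytes libraries. This is a minimal illustration under assumed settings, not the authors' code: the base model name, the JSONL schema, the prompt template, and all hyperparameters below are placeholders.

```python
# Minimal QLoRA fine-tuning sketch (assumed setup, not the paper's code).
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # placeholder open-source base model

# 4-bit NF4 quantization of the frozen base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama-style tokenizers ship no pad token

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)

# LoRA: train small low-rank adapters on the attention projections while the
# quantized base weights stay frozen.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
))

# Assumed JSONL schema: one {"prompt": ..., "response": ...} object per line.
dataset = load_dataset("json", data_files="vi_medical_pairs.jsonl", split="train")

def tokenize(example):
    # Hypothetical instruction template; the paper's actual template is not given.
    text = (f"### Câu hỏi:\n{example['prompt']}\n"
            f"### Trả lời:\n{example['response']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qlora-vi-health", per_device_train_batch_size=4,
        gradient_accumulation_steps=8, learning_rate=2e-4,
        num_train_epochs=1, bf16=True, logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("qlora-vi-health-adapter")  # writes only the LoRA adapter
```

The two automatic metrics reported, BERTScore and ROUGE-L, are both available through the Hugging Face evaluate library; a sketch with toy inputs (the strings are placeholders, not study data):

```python
# Automatic evaluation sketch: BERTScore and ROUGE-L over generated answers
# versus reference answers (toy placeholder strings, not study data).
import evaluate

predictions = ["Bạn nên uống đủ nước và nghỉ ngơi."]       # model outputs
references  = ["Hãy uống nhiều nước và nghỉ ngơi đầy đủ."]  # gold answers

bertscore = evaluate.load("bertscore")
rouge = evaluate.load("rouge")

bs = bertscore.compute(predictions=predictions, references=references, lang="vi")
rl = rouge.compute(predictions=predictions, references=references,
                   rouge_types=["rougeL"])

print(f"BERTScore F1: {sum(bs['f1']) / len(bs['f1']):.3f}")
print(f"ROUGE-L:      {rl['rougeL']:.3f}")
```

The third measure, "LLM-as-a-Judge", has no single library call: it typically means prompting a stronger LLM with a grading rubric to score each generated answer, so it is left out of this sketch.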
Pages: 11