LSTM-Based Prediction Model for Tuberculosis Among HIV-Infected Patients Using Structured Electronic Medical Records: A Retrospective Machine Learning Study

Cited by: 3
Authors
Chen, Jingfang [1 ,2 ]
Liu, Linlin [3 ]
Huang, Junxiong [1 ]
Jiang, Youli [4 ]
Yin, Chengliang [1 ]
Zhang, Lukun [5 ]
Li, Zhihuan [1 ]
Lu, Hongzhou [1 ,5 ]
Affiliations
[1] Macau Univ Sci & Technol, Fac Med, Taipa 999078, Macau, Peoples R China
[2] Third Peoples Hosp Shenzhen, Dept Res & Teaching, Shenzhen 518112, Peoples R China
[3] Univ South China, Sch Nursing, Hengyang Med Sch, Hengyang 421001, Peoples R China
[4] Peoples Hosp Longhua, Dept Neurol, Shenzhen 518109, Peoples R China
[5] Third Peoples Hosp Shenzhen, Natl Clin Res Ctr Infect Dis, Dept Infect Dis, Shenzhen 518112, Peoples R China
Source
JOURNAL OF MULTIDISCIPLINARY HEALTHCARE | 2024 / Vol. 17
Keywords
Prediction models; HIV; Tuberculosis; Machine Learning; Artificial Intelligence; CHINA;
DOI
10.2147/JMDH.S467877
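Chinese Library Classification number
R19 [Health care organization and services (health service administration)];
Discipline classification number
Abstract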
Background: Both HIV and TB are chronic infectious diseases requiring long-term treatment and follow-up, resulting in extensive electronic medical records. With the exponential growth of health and medical big data, effectively extracting and analyzing these data has become a research hotspot. As a fundamental aspect of artificial intelligence, machine learning has been extensively applied in medical research, encompassing diagnosis, treatment, patient monitoring, drug development, and epidemiological investigations. This significantly enhances medical information systems and facilitates the interoperability of medical data.
Methods: We analyzed longitudinal data from the electronic health records of 4540 patients, gathered from the National Clinical Research Center for Infectious Diseases in Shenzhen, China, spanning 2017 to 2021. First, we used a fine-tuned ChatGLM to structure the electronic medical records. We then used a Multilayer Perceptron to classify each patient and determine the presence of tuberculosis in HIV patients. Using machine learning-based natural language processing, we structured these records to build a specialized database for HIV and TB co-infection. We studied the epidemiological characteristics, focusing on incidence patterns, patient characteristics, and influencing factors, to uncover the transmission characteristics of these diseases in Shenzhen. Additionally, we used Long Short-Term Memory to create a predictive model for TB co-infection among HIV patients, based on their medical records. This model predicted the risk of TB co-infection, providing scientific evidence for clinical decision-making and enabling early detection and precise intervention.
Results: With the fine-tuned ChatGLM model tailored for structuring electronic health records, symptom extraction precision consistently exceeded 0.95. Key symptom labels such as "diarrhea" and "normal" showed precision above 0.90. Recall and F1 scores were also high. Among 4540 HIV patients, 758 were diagnosed with concurrent tuberculosis, a 16.7% co-infection rate, while syphilis co-infection affected 25.1%, underscoring the prevalence of concurrent infections among HIV patients. Using electronic health records, a Multilayer Perceptron classifier was developed as a benchmark against Long Short-Term Memory to predict high-risk groups for HIV and tuberculosis co-infection. The Multilayer Perceptron classifier achieved AUROC values ranging from 0.616 to 0.682 on the test set, which shows some ability to identify HIV-TB co-infections but leaves room for further optimization and generalization. In intelligent tuberculosis diagnosis based on laboratory results, the Long Short-Term Memory model performed consistently across 5-fold cross-validation, with AUROC values ranging from 0.827 to 0.850, indicating reliable tuberculosis prediction. Furthermore, by optimizing classification thresholds, the model achieved an overall accuracy of 81.18% in distinguishing HIV with tuberculosis co-infection from HIV infection alone.
Conclusion: Combining the Multilayer Perceptron classifier with Long Short-Term Memory represented an advanced approach for effectively extracting information from electronic health records and using it for disease prediction. This underscored the superior performance of deep learning techniques in managing both structured and unstructured medical data. Models leveraging laboratory time-series data performed notably better than those relying solely on electronic health records for predicting tuberculosis incidence. This emphasized the benefits of deep learning in handling intricate medical data and provided valuable insights for healthcare providers exploring the use of deep learning in disease prediction and management.
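The abstract reports AUROC values and an accuracy obtained "by optimizing classification thresholds" without stating the criterion used. The following is a minimal sketch of how such metrics are commonly computed, not the authors' code: it assumes the threshold is chosen to maximize accuracy, and the function names (`auroc`, `best_threshold`) and the toy labels and scores are hypothetical. Only the 758 and 4540 patient counts come from the abstract.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random
    negative (Mann-Whitney formulation); tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_threshold(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Scan every observed score as a cutoff (predict positive when
    score >= cutoff) and return the (cutoff, accuracy) pair with the
    highest accuracy. Assumption: accuracy is the optimization target."""
    best_t, best_acc = scores[0], -1.0
    for t in sorted(set(scores)):
        correct = sum((s >= t) == (y == 1) for y, s in zip(labels, scores))
        acc = correct / len(labels)
        if acc > best_acc:  # strict '>' keeps the lowest cutoff on ties
            best_t, best_acc = t, acc
    return best_t, best_acc


# Co-infection rate from the counts given in the abstract.
tb_rate_percent = round(758 / 4540 * 100, 1)

# Hypothetical toy data: 1 = HIV with TB co-infection, 0 = HIV only.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.8]

if __name__ == "__main__":
    print(tb_rate_percent)
    print(auroc(toy_labels, toy_scores))
    print(best_threshold(toy_labels, toy_scores))
```

In a cross-validated setting, the cutoff would normally be selected on the training or validation folds and then applied unchanged to the held-out fold, so that the reported accuracy is not optimistically biased.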
Pages: 3557-3573
Page count: 17