A retrospective study using machine learning to develop predictive model to identify rotavirus-associated acute gastroenteritis in children

Citations: 0
Authors
Paul, Sourav [1]
Rahman, Minhazur [2]
Dolley, Anutee [3]
Saikia, Kasturi [3]
Singh, Chongtham Shyamsunder [4]
Mohammed, Arifullah [5]
Muteeb, Ghazala [6]
Sarmah, Rosy [7]
Namsa, Nima D. [3]
Affiliations
[1] Natl Inst Technol, Dept Biotechnol, Durgapur, West Bengal, India
[2] Tezpur Univ, Dept Comp Sci & Engn, Napaam, Assam, India
[3] Tezpur Univ, Dept Mol Biol & Biotechnol, Napaam, Assam, India
[4] Reg Inst Med Sci, Dept Paediat, Imphal, Manipur, India
[5] Univ Malaysia Kelantan, Fac Agrobased Ind, Dept Agr Sci, Kelantan, Malaysia
[6] King Faisal Univ, Coll Appl Med Sci, Dept Nursing, Al Hasa, Saudi Arabia
[7] Tezpur Univ, Dept Comp Sci & Engn, Napaam, Assam, India
Source
PeerJ | 2025 / Volume 13
Keywords
Rotavirus; Gastroenteritis; Machine learning; Disease diagnosis; Supervised learning; Child health; ARTIFICIAL-INTELLIGENCE; DIARRHEA; SURVEILLANCE; IDENTIFICATION; INFECTION; DISEASES; BURDEN; IMPACT;
DOI
10.7717/peerj.19025
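Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [Natural Sciences, General]
Discipline classification codes
07; 0710; 09
Abstract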
Background. Rotavirus is the leading cause of severe dehydrating diarrhea in children under 5 years of age worldwide. Timely diagnosis is critical, but access to confirmatory testing is limited in hospital settings. Machine learning (ML) models have shown promise in supporting symptom-based diagnosis of several diseases in resource-limited settings.
Objectives. This study aims to develop an ML predictive model that integrates clinical parameters from multiple sources specific to rotavirus infection, without relying on laboratory tests.
Methods. A clinical dataset of 509 children was collected in collaboration with the Regional Institute of Medical Sciences, Imphal, India. The clinical symptoms included diarrhea and its duration, number of stool episodes per day, fever, vomiting and its duration, number of vomiting episodes per day, temperature, and dehydration. Correlation analysis was performed to check feature-feature and feature-outcome collinearity. Feature selection using the ANOVA F test was carried out to obtain feature importance values and a reduced feature subset. Seven supervised learning models were tested and compared: support vector machine (SVM), K-nearest neighbor (KNN), naive Bayes (NB), logistic regression (Log_R), random forest (RF), decision tree (DT), and XGBoost (XGB). Model performance was evaluated on the classification results using accuracy, precision, recall, specificity, F1 score, macro F1 score, F2 score, and the receiver operating characteristic (ROC) curve.
Results. Across the seven ML models and the eight evaluation scores, RF performed best, with an accuracy of 81.4%, F1 score of 86.9%, macro F1 score of 77.3%, F2 score of 86.5%, and area under the curve (AUC) of 89%.
Conclusions. ML models can contribute to symptom-based prediction of rotavirus-associated acute gastroenteritis in children, especially in resource-limited settings. Further validation of the models on a larger dataset is needed before they can be applied to pediatric diarrheic populations with optimum sensitivity and specificity.
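The threshold-based evaluation scores named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`f_beta`, `binary_metrics`) and the toy labels are hypothetical, chosen only to show how accuracy, precision, recall, specificity, F1, F2, and macro F1 relate to the confusion-matrix counts. AUC is left out because it needs predicted probabilities rather than hard labels.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    denom = beta ** 2 * precision + recall
    return 0.0 if denom == 0 else (1 + beta ** 2) * precision * recall / denom


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Compute scores for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # also called sensitivity
    specificity = safe_div(tn, tn + fp)   # recall of the negative class
    f1 = f_beta(precision, recall, beta=1.0)
    f2 = f_beta(precision, recall, beta=2.0)

    # Macro F1: unweighted mean of the per-class F1 scores.
    neg_precision = safe_div(tn, tn + fn)
    f1_negative = f_beta(neg_precision, specificity, beta=1.0)

    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "f2": f2,
        "macro_f1": (f1 + f1_negative) / 2,
    }


# Toy labels for illustration only; these are not data from the study.
toy_true = [1, 1, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 1]
toy_scores = binary_metrics(toy_true, toy_pred)
print(toy_scores)
```

Because F2 weights recall more heavily than precision, it suits a screening setting in which a missed infection is costlier than a false alarm. Macro F1 gives the negative class equal weight, which matters when the two classes are imbalanced.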
Pages: 21
Related papers
50 in total
  • [41] Machine learning for early diagnosis of Kawasaki disease in acute febrile children: retrospective cross-sectional study in China
    Zheng, Wei
    Zhu, Shiben
    Wang, Xuelian
    Chen, Cuixuan
    Zhen, Zifeng
    Xu, Yi
    Mo, Xiaolan
    Tse, Gary
    Li, Xufang
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [42] Machine learning model identifies aggressive acute pancreatitis within 48 h of admission: a large retrospective study
    Lei Yuan
    Mengyao Ji
    Shuo Wang
    Xinyu Wen
    Pingxiao Huang
    Lei Shen
    Jun Xu
    BMC Medical Informatics and Decision Making, 22
  • [43] Predictive model for acute respiratory distress syndrome events in ICU patients in China using machine learning algorithms: a secondary analysis of a cohort study
    Xian-Fei Ding
    Jin-Bo Li
    Huo-Yan Liang
    Zong-Yu Wang
    Ting-Ting Jiao
    Zhuang Liu
    Liang Yi
    Wei-Shuai Bian
    Shu-Peng Wang
    Xi Zhu
    Tong-Wen Sun
    Journal of Translational Medicine, 17
  • [44] A machine learning model for predicting acute respiratory distress syndrome risk in patients with sepsis using circulating immune cell parameters: a retrospective study
    Kaihuan Zhou
    Lian Qin
    Yin Chen
    Hanming Gao
    Yicong Ling
    Qianqian Qin
    Chenglin Mou
    Tao Qin
    Junyu Lu
    BMC Infectious Diseases, 25 (1)
  • [45] A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation
    Zhen-nan Yuan
    Yu-juan Xue
    Hai-jun Wang
    Shi-ning Qu
    Chu-lin Huang
    Hao Wang
    Hao Zhang
    Min-ze Zhang
    Xue-zhong Xing
    Scientific Reports, 15 (1)
  • [46] Development of a multi-laboratory integrated predictive model for silicosis utilizing machine learning: a retrospective case-control study
    Sun, Guo-kang
    Xiang, Yun-hui
    Wang, Lu
    Xiang, Pin-pin
    Wang, Zi-xin
    Zhang, Jing
    Wu, Ling
    FRONTIERS IN PUBLIC HEALTH, 2025, 12
  • [47] A machine learning-based predictive model for biliary stricture attributable to malignant tumors: a dual-center retrospective study
    Yang, Qifan
    Nie, Lu
    Xu, Jian
    Li, Hua
    Zhu, Xin
    Wei, Mingwei
    Yao, Jun
    FRONTIERS IN ONCOLOGY, 2024, 14
  • [48] Predictive value of red blood cell distribution width in septic shock patients with thrombocytopenia: A retrospective study using machine learning
    Ling, Jianmin
    Liao, Tongzhou
    Wu, Yanqing
    Wang, Zhaohua
    Jin, Hai
    Lu, Feng
    Fang, Minghao
    JOURNAL OF CLINICAL LABORATORY ANALYSIS, 2021, 35 (12)
  • [49] Predictive model of acute kidney injury in critically ill patients with acute pancreatitis: a machine learning approach using the MIMIC-IV database
    Lin, Shengwei
    Lu, Wenbin
    Wang, Ting
    Wang, Ying
    Leng, Xueqian
    Chi, Lidan
    Jin, Peipei
    Bian, Jinjun
    RENAL FAILURE, 2024, 46 (01)
  • [50] Development of a Predictive Model of Occult Cancer After a Venous Thromboembolism Event Using Machine Learning: The CLOVER Study
    Franco-Moreno, Anabel
    Madronal-Cerezo, Elena
    de Ancos-Aracil, Cristina Lucia
    Farfan-Sedano, Ana Isabel
    Munoz-Rivas, Nuria
    Bascunana Morejon-Giron, Jose
    Ruiz-Giardin, Jose Manuel
    Alvarez-Rodriguez, Federico
    Prada-Alonso, Jesus
    Gala-Garcia, Yvonne
    Casado-Suela, Miguel Angel
    Bustamante-Fermosel, Ana
    Alfaro-Fernandez, Nuria
    Torres-Macho, Juan
    CLOVER Res Grp
    MEDICINA-LITHUANIA, 2025, 61 (01):