Enhancing severe hypoglycemia prediction in type 2 diabetes mellitus through multi-view co-training machine learning model for imbalanced dataset

Cited by: 2
Authors
Agraz, Melih [1, 2, 6]
Deng, Yixiang [4, 5]
Karniadakis, George Em [1, 3]
Mantzoros, Christos Socrates [6]
Affiliations
[1] Brown Univ, Div Appl Math, Providence, RI 02912 USA
[2] Giresun Univ, Dept Stat, TR-28200 Giresun, Turkiye
[3] Brown Univ, Sch Engn, Providence, RI 02912 USA
[4] Univ Delaware, Coll Engn, Dept Comp & Informat Sci, Newark, DE 19716 USA
[5] MIT & Harvard, Ragon Inst Mass Gen, Cambridge, MA 02142 USA
[6] Harvard Med Sch, Beth Israel Deaconess Med Ctr, Dept Endocrinol, Boston, MA 02215 USA
Keywords
FEATURE-SELECTION; MORTALITY; HEALTH; RISK;
DOI
10.1038/s41598-024-69844-z
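Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract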
In patients with type 2 diabetes mellitus (T2DM), severe hypoglycemia (SH) poses a considerable risk of long-term death, especially among the elderly, and demands urgent medical attention. Accurate prediction of SH remains challenging because of its multifaceted nature, with contributing factors that include medications, lifestyle choices, and metabolic measurements. In this study, we propose a systematic approach to improve the robustness and accuracy of SH predictions using machine learning models, guided by clinical feature selection. Our focus is on developing long-term SH prediction models using both semi-supervised and supervised learning algorithms. Using data from the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial, which includes electronic health records for over 10,000 individuals, we study adults with T2DM. Our results indicate that a multi-view co-training method built on the random forest algorithm improves the specificity of SH prediction, while the same setup with Naive Bayes in place of random forest achieves better sensitivity. Our framework also makes the machine learning models interpretable by identifying key predictors of hypoglycemia, including fasting plasma glucose, hemoglobin A1c, general diabetes education, and NPH or L insulins. Integrating data routinely available in electronic health records significantly enhances the model's ability to predict SH events, which shows its potential to support clinical practice by facilitating early interventions and optimizing patient management. By improving prediction accuracy and identifying the most informative predictive features, our study advances the understanding and management of hypoglycemia in this population.
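The multi-view co-training idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the split into a "labs" view and a "meds" view is a hypothetical stand-in for clinically chosen feature groups, and the base learner is a small hand-written Gaussian Naive Bayes chosen so the example needs only the standard library. All function and variable names are assumptions made for illustration. In each round, each view's classifier pseudo-labels the unlabeled rows it is most confident about, adding more negatives than positives to roughly respect the class imbalance.

```python
import math
import random

class GaussianNB:
    """Tiny Gaussian Naive Bayes for binary labels (0/1), standard library only."""
    def fit(self, X, y):
        self.stats = {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(col) for col in cols]
            # Small constant keeps variances strictly positive.
            variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-6
                         for col, m in zip(cols, means)]
            self.stats[c] = (math.log(len(rows) / len(y)), means, variances)
        return self

    def prob_positive(self, x):
        log_post = {}
        for c, (log_prior, means, variances) in self.stats.items():
            ll = log_prior
            for v, m, s2 in zip(x, means, variances):
                ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
            log_post[c] = ll
        d = log_post[1] - log_post[0]          # log-odds of the positive class
        if d >= 0:
            return 1.0 / (1.0 + math.exp(-d))  # numerically stable sigmoid
        e = math.exp(d)
        return e / (1.0 + e)

def co_train(view_a, view_b, labels, rounds=10, n_pos=1, n_neg=3):
    """labels[i] is 0/1 for labeled rows and None for unlabeled rows."""
    labels = list(labels)
    views = (view_a, view_b)
    for _ in range(rounds):
        labeled = [i for i, y in enumerate(labels) if y is not None]
        unlabeled = [i for i, y in enumerate(labels) if y is None]
        if not unlabeled:
            break
        y_lab = [labels[i] for i in labeled]
        models = [GaussianNB().fit([v[i] for i in labeled], y_lab) for v in views]
        for model, view in zip(models, views):
            # Rank still-unlabeled rows by this view's confidence.
            scored = sorted((model.prob_positive(view[i]), i)
                            for i in unlabeled if labels[i] is None)
            for _, i in scored[:n_neg]:
                labels[i] = 0                  # most confident negatives
            for _, i in scored[n_neg:][-n_pos:]:
                labels[i] = 1                  # most confident positives
    labeled = [i for i, y in enumerate(labels) if y is not None]
    y_lab = [labels[i] for i in labeled]
    models = [GaussianNB().fit([v[i] for i in labeled], y_lab) for v in views]
    return models, labels

def predict_proba(models, xa, xb):
    """Average the two views' positive-class probabilities."""
    return (models[0].prob_positive(xa) + models[1].prob_positive(xb)) / 2

# Synthetic, imbalanced data: 10% positives, two features per hypothetical view.
random.seed(0)
truth = [1 if i % 10 == 0 else 0 for i in range(400)]
labs = [[random.gauss(2.0 * t, 1.0), random.gauss(2.0 * t, 1.0)] for t in truth]
meds = [[random.gauss(2.0 * t, 1.0), random.gauss(2.0 * t, 1.0)] for t in truth]
initial_labels = [t if i < 60 else None for i, t in enumerate(truth)]

models, final_labels = co_train(labs, meds, initial_labels)

held_out = range(60, 400)                      # rows that started unlabeled
probs = [predict_proba(models, labs[i], meds[i]) for i in held_out]
preds = [1 if p >= 0.5 else 0 for p in probs]
tp = sum(1 for i, p in zip(held_out, preds) if p == 1 and truth[i] == 1)
tn = sum(1 for i, p in zip(held_out, preds) if p == 0 and truth[i] == 0)
n_pos_true = sum(truth[i] for i in held_out)
n_neg_true = len(held_out) - n_pos_true
print("sensitivity:", tp / n_pos_true)
print("specificity:", tn / n_neg_true)
```

Reporting sensitivity and specificity separately, rather than accuracy alone, mirrors how the abstract compares base learners: on an imbalanced dataset a model can score high accuracy while missing most SH events. Swapping the base learner (for example, to a random forest from an external library) would only require an object with the same `fit` and `prob_positive` interface used here.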
Pages: 12