Early temporal prediction of Type 2 Diabetes Risk Condition from a General Practitioner Electronic Health Record: A Multiple Instance Boosting Approach

Cited by: 33
Authors
Bernardini, Michele [1]
Morettini, Micaela [1]
Romeo, Luca [1, 2]
Frontoni, Emanuele [1]
Burattini, Laura [1]
Affiliations
[1] Univ Politecn Marche, Dept Informat Engn DII, Ancona, Italy
[2] Ist Italiano Tecnol, Cognit Mot & Neurosci & Computat Stat & Machine L, Genoa, Italy
Keywords
Type 2 Diabetes; Machine Learning; Predictive Medicine; Temporal Analysis; Electronic Health Record; Clinical Decision Support System; INSULIN-RESISTANCE; FASTING GLUCOSE; CLASSIFICATION; TRIGLYCERIDES; PRODUCT;
DOI
10.1016/j.artmed.2020.101847
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Early identification of patients at high risk of developing Type 2 diabetes (T2D) plays a significant role in preventing the onset of overt disease and its associated comorbidities. Although insulin resistance is fundamental in the early phases of T2D natural history, it is not usually quantified by General Practitioners (GPs). The triglyceride-glucose (TyG) index has proven useful in clinical studies for quantifying insulin resistance and for the early identification of individuals at T2D risk, but it is still not applied by GPs for diagnostic purposes. The aim of this study is to propose a multiple instance learning boosting algorithm (MIL-Boost) for building a predictive model capable of early prediction of worsening insulin resistance (low vs. high T2D risk) in terms of the TyG index. MIL-Boost is applied to past electronic health record (EHR) patient information stored by a single GP. The proposed MIL-Boost algorithm proved effective for this task, outperforming the other state-of-the-art machine learning (ML) competitors (Recall from 0.70 up to 0.83). The proposed MIL-based approach is able to extract hidden patterns from past EHR temporal data, even without directly exploiting triglyceride and glucose measurements. The major advantages of the method lie in its ability to model the temporal evolution of longitudinal EHR data while dealing with small sample sizes and variability in the observations (e.g., a small, variable number of prescriptions for non-hospitalized patients). The proposed algorithm may represent the core of a clinical decision support system.
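As a rough illustration of the two quantities the abstract relies on, the sketch below computes the TyG index with its commonly cited definition, ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), and aggregates per-instance probabilities into a bag-level probability with a noisy-OR rule, one common choice in MIL boosting. The function names, the mg/dL unit convention, and the choice of noisy-OR are assumptions made for illustration; the abstract does not state which aggregation rule or risk threshold the authors used.

```python
import math
from typing import Iterable


def tyg_index(triglycerides_mg_dl: float, glucose_mg_dl: float) -> float:
    """TyG index using the commonly cited definition:
    ln(fasting triglycerides [mg/dL] * fasting glucose [mg/dL] / 2).
    The units are an assumption of this sketch."""
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("measurements must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


def noisy_or_bag_probability(instance_probs: Iterable[float]) -> float:
    """Bag-level probability under a noisy-OR rule (an assumed choice):
    a bag (e.g. one patient's past EHR observations) is positive if at
    least one of its instances is positive."""
    prob_all_negative = 1.0
    for p in instance_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("instance probabilities must lie in [0, 1]")
        prob_all_negative *= 1.0 - p
    return 1.0 - prob_all_negative


if __name__ == "__main__":
    # Hypothetical values, not taken from the study.
    print(tyg_index(150.0, 100.0))
    print(noisy_or_bag_probability([0.5, 0.5]))
```

In this reading, each patient is a bag whose instances are time-stamped EHR observations, and the bag label (low vs. high T2D risk) would be derived from a later TyG value; that mapping is an interpretation of the abstract, not a description of the authors' implementation.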
Pages: 11
Related papers
54 in total
[1]   A Systematic Review of Biomarkers and Risk of Incident Type 2 Diabetes: An Overview of Epidemiological, Prediction and Aetiological Research Literature [J].
Abbasi, Ali ;
Sahlqvist, Anna-Stina ;
Lotta, Luca ;
Brosnan, Julia M. ;
Vollenweider, Peter ;
Giabbanelli, Philippe ;
Nunez, Derek J. ;
Waterworth, Dawn ;
Scott, Robert A. ;
Langenberg, Claudia ;
Wareham, Nicholas J. .
PLOS ONE, 2016, 11 (10)
[2]   Electronic health record phenotyping improves detection and screening of type 2 diabetes in the general United States population: A cross-sectional, unselected, retrospective study [J].
Anderson, Ariana E. ;
Kerr, Wesley T. ;
Thames, April ;
Li, Tong ;
Xiao, Jiayang ;
Cohen, Mark S. .
JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 60 :162-168
[3]  
[Anonymous], AMIA ANN S P
[4]  
[Anonymous], 2016, MULTIPLE INSTANCE LE
[5]  
[Anonymous], 2003, ADV NEURAL INFORM PR
[6]  
[Anonymous], DIABETES METAB
[7]   Robust Object Tracking with Online Multiple Instance Learning [J].
Babenko, Boris ;
Yang, Ming-Hsuan ;
Belongie, Serge .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (08) :1619-1632
[8]   Nearest neighbor imputation algorithms: a critical evaluation [J].
Beretta, Lorenzo ;
Santaniello, Alessandro .
BMC MEDICAL INFORMATICS AND DECISION MAKING, 2016, 16
[9]   TyG-er: An ensemble Regression Forest approach for identification of clinical factors related to insulin resistance condition using Electronic Health Records [J].
Bernardini, Michele ;
Morettini, Micaela ;
Romeo, Luca ;
Frontoni, Emanuele ;
Burattini, Laura .
COMPUTERS IN BIOLOGY AND MEDICINE, 2019, 112
[10]   Discovering the Type 2 Diabetes in Electronic Health Records Using the Sparse Balanced Support Vector Machine [J].
Bernardini, Michele ;
Romeo, Luca ;
Misericordia, Paolo ;
Frontoni, Emanuele .
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2020, 24 (01) :235-246