Cleansing and Imputation of Body Mass Index Data and Its Impact on a Machine Learning Based Prediction Model

Cited by: 5
Authors
Jauk, Stefanie [1]
Kramer, Diether [2]
Leodolter, Werner [2]
Affiliations
[1] CBmed, Graz, Austria
[2] Steiermark Krankenanstaltengesell mbH KAGeS, Graz, Austria
Source
HEALTH INFORMATICS MEETS EHEALTH: BIOMEDICAL MEETS EHEALTH - FROM SENSORS TO DECISIONS | 2018 / Vol. 248
Keywords
Electronic health records; body mass index; machine learning; data imputation; data cleansing; predictive modelling; RISK; DELIRIUM;
DOI
10.3233/978-1-61499-858-7-116
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Background: A challenge of using electronic health records for secondary analyses is data quality. Body mass index (BMI) is an important predictor for various diseases but is often not documented properly. Objectives: The aim of our study is to perform data cleansing on BMI values and to find the best method for imputing missing values in order to increase data quality. Further, we want to assess the effect of changes in data quality on the performance of a prediction model based on machine learning. Methods: After data cleansing on BMI data, we compared machine learning methods and statistical methods with respect to the accuracy of their imputed values, using the root mean square error. In a second step, we used three variations of BMI data as a training set for a model predicting the occurrence of delirium. Results: Neural network and linear regression models performed best for imputation. There were no changes in model performance for different BMI input data. Conclusion: Although data quality issues may lead to biases, they do not always affect the performance of secondary analyses.
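The comparison of imputation methods by root mean square error described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the predictors (`age`, `sex`), the hold-out fraction, and the function names `rmse` and `compare_imputers` are all assumptions made for illustration, and only mean imputation and linear regression are shown (the neural network is omitted).

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between observed and imputed values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def compare_imputers(seed=0, n=500, holdout_frac=0.2):
    """Hide a share of known BMI values, impute them, and score each method.

    All data here are synthetic placeholders (an assumption), not study data.
    """
    rng = np.random.default_rng(seed)
    age = rng.uniform(20, 90, n)   # hypothetical predictor
    sex = rng.integers(0, 2, n)    # hypothetical predictor (0/1)
    bmi = 22 + 0.05 * age + 1.5 * sex + rng.normal(0, 3, n)

    # Treat a random subset of records as "missing" so the truth is known.
    idx = rng.permutation(n)
    n_hold = int(n * holdout_frac)
    hold, known = idx[:n_hold], idx[n_hold:]

    # Statistical baseline: impute the mean of the known values.
    mean_pred = np.full(n_hold, bmi[known].mean())

    # Linear regression fitted by ordinary least squares on the known records.
    X = np.column_stack([np.ones(n), age, sex])
    coef, *_ = np.linalg.lstsq(X[known], bmi[known], rcond=None)
    lin_pred = X[hold] @ coef

    return {
        "mean": rmse(bmi[hold], mean_pred),
        "linear_regression": rmse(bmi[hold], lin_pred),
    }


if __name__ == "__main__":
    for name, score in compare_imputers().items():
        print(f"{name}: RMSE = {score:.2f}")
```

A lower RMSE on the hidden values indicates a more accurate imputation; the same scoring function could be applied to any further imputation method.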
Pages: 116-123
Page count: 8