Missing data analyses: a hybrid multiple imputation algorithm using Gray System Theory and entropy based on clustering

Cited by: 0
Authors
Jing Tian
Bing Yu
Dan Yu
Shilong Ma
Affiliations
[1] Beihang University, State Key Laboratory of Software Development Environment
Source
Applied Intelligence | 2014 / Vol. 40
Keywords
Missing data; Multiple imputation; Gray System Theory; Entropy; Clustering;
DOI
Not available
Abstract
Researchers and practitioners who work with databases often find knowledge discovery and application development cumbersome because of missing data. Although some approaches tolerate a certain rate of incomplete data, a large portion of them demand high-quality, complete data. Consequently, many strategies have been designed to handle missingness, particularly through imputation. Single imputation methods initially succeeded in predicting missing values for specific types of distributions. Multiple imputation algorithms, however, have remained prevalent because they further improve validity by iteratively minimizing bias and require less prior knowledge of the distributions.
Pages: 376-388
Number of pages: 12