Optimum estimation of missing values in randomized complete block design by genetic algorithm

Cited by: 17
Authors
Azadeh, A. [1, 2]
Asadzadeh, S. M. [1, 2]
Jafari-Marandi, R. [1, 2]
Nazari-Shirkouhi, S. [1, 2]
Khoshkhou, G. Baharian [3]
Talebi, S. [4]
Naghavi, A. [1, 2]
Affiliations
[1] Univ Tehran, Dept Ind Engn, Ctr Excellence Intelligent Expt Mech, Tehran 14174, Iran
[2] Univ Tehran, Dept Engn Optimizat Res, Coll Engn, Tehran 14174, Iran
[3] Univ Illinois, Dept Mech & Ind Engn, Urbana, IL 61801 USA
[4] N Carolina State Univ, Dept Ind Engn, Raleigh, NC 27695 USA
Keywords
Missing values; Genetic algorithm (GA); Artificial Neural Network (ANN); Particle swarm optimization (PSO); Regression methods; Complete randomized block design; MULTIPLE IMPUTATION; PREPROCESSING METHOD; NEURO-FUZZY; CLASSIFICATION; LIKELIHOOD; DISCRETE; OUTLIERS; DEAL;
DOI
10.1016/j.knosys.2012.06.014
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Missing data occur in almost all research, and researchers regularly have to decide how to deal with them. There are a number of alternative ways of dealing with missing data, and the problem has been treated adequately in various real-world data sets. Several statistical methods have been developed since the early 1970s, when advances in computing made complicated numerical calculations feasible. The purpose of this research is to estimate missing values in a randomized complete block design (RCBD) table by using a genetic algorithm (GA) approach and to compare the computational results with four other methods, namely particle swarm optimization (PSO), artificial neural network (ANN), approximate analysis, and the exact regression method. To this end, 30 independent experiments were conducted to estimate missing values in 30 RCBD tables by the GA, PSO, ANN, exact regression, and approximate analysis methods. Computational results indicated that the best answer (in the last 10-chromosome population) obtained by GA is frequently the same as the missing value, with the mean value being close to the missing observation. It is concluded that GA provides much better estimates than the other methods. The superiority of GA is shown through lower estimation errors and a Pearson correlation experiment. Therefore, the GA approach of this study is suggested for estimating missing values in RCBD. (C) 2012 Elsevier B.V. All rights reserved.
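The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fitness function, the operators, or the parameter settings, so the code below assumes that fitness is the error sum of squares of the additive RCBD model, uses a simple real-coded GA with blend crossover and decaying Gaussian mutation, and borrows only the population size of 10 from the abstract. The function names (`error_ss`, `yates_estimate`, `ga_estimate`) are hypothetical. The closed-form estimate (tT + bB - G) / ((t - 1)(b - 1)) is the classical single-missing-value formula for an RCBD and serves here as a reference point.

```python
import random


def error_ss(table, i, j, x):
    """Error sum of squares of the additive RCBD model with cell (i, j) set to x."""
    t, b = len(table), len(table[0])      # t treatments (rows), b blocks (columns)
    y = [row[:] for row in table]
    y[i][j] = x
    grand = sum(map(sum, y)) / (t * b)
    row_mean = [sum(r) / b for r in y]
    col_mean = [sum(y[r][c] for r in range(t)) / t for c in range(b)]
    return sum((y[r][c] - row_mean[r] - col_mean[c] + grand) ** 2
               for r in range(t) for c in range(b))


def yates_estimate(table, i, j):
    """Classical estimate (t*T + b*B - G) / ((t-1)(b-1)) from the observed totals."""
    t, b = len(table), len(table[0])
    T = sum(v for c, v in enumerate(table[i]) if c != j)       # treatment total
    B = sum(table[r][j] for r in range(t) if r != i)           # block total
    G = sum(table[r][c] for r in range(t) for c in range(b)
            if (r, c) != (i, j))                                # grand total
    return (t * T + b * B - G) / ((t - 1) * (b - 1))


def ga_estimate(table, i, j, pop_size=10, generations=200, seed=0):
    """Assumed real-coded GA: keep the better half, blend parents, add decaying noise."""
    rng = random.Random(seed)
    observed = [table[r][c] for r in range(len(table))
                for c in range(len(table[0])) if (r, c) != (i, j)]
    lo, hi = min(observed), max(observed)
    span = hi - lo

    def fitness(x):
        return error_ss(table, i, j, x)

    # Each chromosome is one candidate value for the missing cell.
    pop = [rng.uniform(lo - span, hi + span) for _ in range(pop_size)]
    for g in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]
        sigma = 0.5 * span * (0.97 ** g)  # mutation scale shrinks each generation
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            w = rng.random()
            children.append(w * p1 + (1 - w) * p2 + rng.gauss(0, sigma))
        pop = elite + children
    return min(pop, key=fitness)


if __name__ == "__main__":
    # Illustrative table (made-up numbers); None marks the missing observation.
    table = [
        [10.0, 12.0, 11.0],
        [14.0, None, 15.0],
        [9.0, 11.0, 10.0],
        [13.0, 15.0, 14.0],
    ]
    print("closed form:", yates_estimate(table, 1, 1))
    print("GA sketch:  ", ga_estimate(table, 1, 1))
```

Because the assumed fitness is a quadratic function of the single unknown, this sketch should settle near the closed-form estimate; the comparison reported in the abstract concerns the authors' own GA design and cannot be reproduced from this illustration.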
Pages: 37-47
Number of pages: 11