A Combination of Multiple Imputation and Principal Component Analysis to Handle Missing Value with Arbitrary Pattern

Cited by: 0
Authors
Anindita, Novita [1]
Nugroho, Hanung Adi [1]
Adji, Teguh Bharata [1]
Affiliations
[1] Univ Gadjah Mada, Fac Engn, Dept Elect Engn & Informat Technol, Yogyakarta, Indonesia
Source
2017 7TH INTERNATIONAL ANNUAL ENGINEERING SEMINAR (INAES) | 2017
Keywords
Multiple Imputation; Markov Chain Monte Carlo; Fully Conditional Specification; Principal Component Analysis; SELECTION; VIRUS;
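DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification code
0808; 0809
Abstract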
Hepatitis is a major health problem that can progress to chronic hepatitis and cancer. Computer-based diagnosis is now commonly used in medical examination. Such diagnosis uses a disease dataset as a reference for making decisions. However, the dataset was incomplete: many of its instances contained missing values, which can bias the results of the analysis. One method of handling missing values is Multiple Imputation. The hepatitis dataset has an arbitrary pattern of missing values, which can be handled by using Markov Chain Monte Carlo (MCMC) and Fully Conditional Specification (FCS) as Multiple Imputation algorithms. This research conducted an experiment to compare combinations of Multiple Imputation algorithms with Principal Component Analysis (PCA) as instance selection. Instance selection was applied to reduce the data by selecting the variables that contribute most to the dataset. The goal was to improve the accuracy of analysis on data with missing values in an arbitrary pattern. The results showed that FCS-PCA performed best, with the highest accuracy (98.80%) and the lowest error rate (0.0116).
Pages: 1 / 5
Page count: 5