A selective Bayes Classifier for classifying incomplete data based on gain ratio

Cited by: 26
Authors
Chen, Jingnian [1, 2]
Huang, Houkuan [1 ]
Tian, Fengzhan [1 ]
Tian, Shengfeng [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing 100044, Peoples R China
[2] Shandong Univ Finance, Dept Informat & Comp Sci, Jinan 250014, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bayesian classifiers; Feature selection; Incomplete data; Gain ratio
DOI
10.1016/j.knosys.2008.03.013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-world data sets are often incomplete for a variety of reasons. Although numerous classification algorithms have been proposed, most of them handle only complete data, so methods for constructing classifiers from incomplete data deserve more attention. After analyzing the main methods of processing incomplete data for classification, this paper presents a selective Bayes classifier for incomplete data that uses a simpler formula for computing gain ratio. The proposed algorithm requires none of the assumptions about the data set that previous methods of processing incomplete data in classification depend on. Experiments on 12 benchmark incomplete data sets show that this method can greatly improve classification accuracy. Furthermore, it can sharply reduce the number of attributes and thereby greatly simplify both the data sets and the classifiers. (c) 2008 Elsevier B.V. All rights reserved.
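The abstract does not state the paper's simplified gain-ratio formula, so it is not reproduced here. As a rough illustration of the general idea only, the sketch below computes the standard gain ratio (information gain divided by split information) of one discrete attribute, using only the rows where that attribute is observed. The `MISSING` marker, the function names, and the observed-rows-only convention are assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter

MISSING = None  # assumption: missing attribute values are encoded as None


def entropy(labels):
    """Shannon entropy (base 2) of a list of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, classes):
    """Standard gain ratio of one attribute, computed on observed rows only.

    This is a hypothetical helper for illustration; rows whose attribute
    value is MISSING are simply skipped.
    """
    observed = [(v, c) for v, c in zip(values, classes) if v is not MISSING]
    if not observed:
        return 0.0
    n = len(observed)
    obs_values = [v for v, _ in observed]
    obs_classes = [c for _, c in observed]

    # Conditional entropy H(class | attribute) over the observed rows.
    cond = 0.0
    for v, count in Counter(obs_values).items():
        subset = [c for a, c in observed if a == v]
        cond += (count / n) * entropy(subset)

    gain = entropy(obs_classes) - cond   # information gain
    split_info = entropy(obs_values)     # entropy of the attribute itself
    return gain / split_info if split_info > 0 else 0.0


def select_attributes(columns, classes, threshold):
    """Keep the names of attributes whose gain ratio exceeds the threshold."""
    return [name for name, col in columns.items()
            if gain_ratio(col, classes) > threshold]
```

A selective classifier in this spirit would rank attributes with `gain_ratio` and build a naive Bayes model on the retained ones; the threshold in `select_attributes` is a free parameter of this sketch, not a value taken from the paper.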
Pages: 530-534
Number of pages: 5