AN HYBRID APPROACH TO FEATURE SELECTION FOR MIXED CATEGORICAL AND CONTINUOUS DATA

Cited by: 0
Authors
Doquire, Gauthier [1]
Verleysen, Michel [1]
Affiliations
[1] Catholic Univ Louvain, Machine Learning Grp, ICTEAM Inst, Pl Levant 3, B-1348 Louvain La Neuve, Belgium
Source
KDIR 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL | 2011
Keywords
Feature selection; Categorical features; Continuous features; Mutual information
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
This paper proposes an algorithm for feature selection in the case of mixed data. It consists of ranking the categorical and the continuous features independently, then recombining them according to the accuracy of a classifier. The popular mutual information criterion is used in both ranking procedures. The proposed algorithm thus avoids the use of any similarity measure between samples described by continuous and categorical attributes, which can be ill-suited to many real-world problems. It effectively detects the most useful features of each type, and its effectiveness is demonstrated experimentally on four real-world data sets.
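
The two-stage idea in the abstract (rank each feature type by mutual information, then merge the rankings using classifier accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation. Several parts are assumptions made for illustration: continuous features are scored after equal-width binning rather than with a dedicated continuous estimator, the merge is a greedy head-to-head comparison, the classifier is a leave-one-out majority-vote lookup table, and all function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def discretize(values, bins=4):
    """Equal-width binning (assumption: stand-in for a continuous MI estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]

def rank_by_mi(columns, labels):
    """Feature names sorted by decreasing mutual information with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a majority vote over exactly matching rows."""
    hits = 0
    for i, row in enumerate(rows):
        votes = Counter(l for j, (r, l) in enumerate(zip(rows, labels))
                        if j != i and r == row)
        if not votes:  # no matching row: fall back to the overall majority class
            votes = Counter(l for j, l in enumerate(labels) if j != i)
        hits += votes.most_common(1)[0][0] == labels[i]
    return hits / len(labels)

def merge_rankings(cat_rank, con_rank, accuracy):
    """Greedy merge (assumption): add the list head that gives higher accuracy."""
    cat, con, selected = list(cat_rank), list(con_rank), []
    while cat or con:
        heads = [lst for lst in (cat, con) if lst]
        best = max(heads, key=lambda lst: accuracy(selected + [lst[0]]))
        selected.append(best.pop(0))
    return selected

# Toy data, invented for illustration only.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
categorical = {
    "color": ["r", "r", "r", "b", "b", "b", "b", "r"],
    "shape": ["s", "c", "s", "c", "s", "c", "s", "c"],
}
continuous = {
    "height": [1.0, 1.2, 1.1, 1.3, 3.0, 3.2, 3.1, 3.3],
    "noise": [0.5, 2.5, 1.5, 3.5, 0.6, 2.6, 1.6, 3.6],
}
binned = {name: discretize(col) for name, col in continuous.items()}
columns = {**categorical, **binned}

def accuracy(names):
    """Accuracy of the lookup classifier restricted to the named features."""
    rows = list(zip(*(columns[n] for n in names)))
    return loo_accuracy(rows, labels)

cat_rank = rank_by_mi(categorical, labels)   # ranking of categorical features
con_rank = rank_by_mi(binned, labels)        # ranking of continuous features
order = merge_rankings(cat_rank, con_rank, accuracy)
print(order)  # → ['height', 'color', 'shape', 'noise']
```

Because each ranking is computed within a single feature type and the lookup classifier only tests rows for exact equality, the sketch never needs a distance between samples that mix categorical and continuous attributes. In practice one would also record the accuracy at each merge step and keep the prefix of `order` that scores best.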
Pages: 394-401
Number of pages: 8