Invariant optimal feature selection: A distance discriminant and feature ranking based solution

Cited by: 71
Authors
Liang, Jianning [1]
Yang, Su [1]
Winstanley, Adam [2]
Affiliations
[1] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Dept Comp Sci & Engn, Shanghai 200433, Peoples R China
[2] Natl Univ Ireland, Natl Ctr Geocomputat, Dept Comp Sci, Maynooth, Kildare, Ireland
Funding
National Natural Science Foundation of China
Keywords
optimal feature selection; distance discriminant; feature ranking;
DOI
10.1016/j.patcog.2007.10.018
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of feature selection is to find the optimal subset of m features chosen from the total n features. One critical problem for many feature selection methods is that an exhaustive search strategy has to be applied to find the best subset among all $\binom{n}{m}$ possible feature subsets, which usually results in considerably high computational complexity. The alternative suboptimal feature selection methods are more practical in terms of computational complexity, but they cannot guarantee that the finally selected feature subset is globally optimal. We propose a new feature selection algorithm based on a distance discriminant (FSDD), which not only solves the problem of high computational cost but also overcomes the drawbacks of the suboptimal methods. The proposed method is able to find the optimal feature subset without exhaustive search or a branch-and-bound algorithm. The most difficult problem in optimal feature selection, the search problem, is converted into a feature ranking problem following a rigorous theoretical proof, so that the computational complexity is greatly reduced. The proposed method is invariant to linear transformations of the data when a diagonal transformation matrix is applied. FSDD was compared with ReliefF and mrmrMID, which is based on mutual information, on eight data sets. The experimental results show that FSDD outperforms the other two methods and is highly efficient. © 2007 Elsevier Ltd. All rights reserved.
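The reduction from subset search to feature ranking described in the abstract can be illustrated with a minimal sketch. The scoring function below is an assumption made for illustration only and is not the FSDD criterion from the paper: it takes a between-class spread minus a weighted within-class spread, divided by the feature variance. The names `feature_scores`, `top_m`, `best_subset_exhaustive` and the weight `beta` are hypothetical. The sketch relies on one general fact: when a subset criterion is a sum of per-feature scores, the m highest-ranked features form an optimal subset, so no search over all subsets is needed.

```python
from itertools import combinations
from statistics import mean, pvariance


def feature_scores(X, y, beta=2.0):
    """Illustrative per-feature score (NOT the paper's exact criterion):
    (between-class spread - beta * within-class spread) / feature variance.
    Dividing by the variance makes the score unchanged when a feature is
    rescaled by a nonzero constant, i.e. under a diagonal transformation."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = mean(col)
        between = 0.0
        within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            prior = len(vals) / len(col)
            between += prior * (mean(vals) - overall) ** 2
            within += prior * pvariance(vals)
        # Assumes every feature has nonzero variance.
        scores.append((between - beta * within) / pvariance(col))
    return scores


def top_m(scores, m):
    """Ranking solution: indices of the m highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:m]


def best_subset_exhaustive(scores, m):
    """Brute-force check: maximize the summed score over all m-subsets."""
    return max(
        combinations(range(len(scores)), m),
        key=lambda subset: sum(scores[j] for j in subset),
    )
```

For an additive criterion like this one, `top_m` costs one sort over n scores, whereas `best_subset_exhaustive` enumerates every m-subset; the two agree whenever the scores have no ties at the cut-off.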
Pages: 1429-1439
Page count: 11