On fuzzy-rough attribute selection: Criteria of Max-Dependency, Max-Relevance, Min-Redundancy, and Max-Significance

Cited by: 43
Authors
Maji, Pradipta [1 ]
Garai, Partha [1 ]
Affiliation
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, W Bengal, India
Keywords
Pattern recognition; Data mining; Attribute selection; Rough sets; Classification; Mutual information; Clustering algorithm; Reduction; Sets
DOI
10.1016/j.asoc.2012.09.006
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Attribute selection is one of the important problems encountered in pattern recognition, machine learning, data mining, and bioinformatics. It refers to the problem of selecting those input attributes or features that are most effective for predicting the sample categories. In this regard, rough set theory has been shown to be successful in selecting relevant and nonredundant attributes from a given data set. However, classical rough sets are unable to handle real-valued noisy features. This problem can be addressed by fuzzy-rough sets, which are a generalization of classical rough sets. A feature selection method based on fuzzy-rough sets is presented here that maximizes both the relevance and the significance of the selected features. This paper also presents different feature evaluation criteria, such as dependency, relevance, redundancy, and significance, for the attribute selection task using fuzzy-rough sets. The performance of different rough set models is compared with that of some existing feature evaluation indices based on the predictive accuracy of the nearest neighbor rule, support vector machine, and decision tree. The effectiveness of the fuzzy-rough set based attribute selection method, along with a comparison with existing feature evaluation indices and different rough set models, is demonstrated on a set of benchmark and microarray gene expression data sets. (C) 2012 Elsevier B.V. All rights reserved.
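The selection strategy the abstract describes — greedily adding the feature that maximizes its relevance together with its significance relative to the already-selected set — can be illustrated with a minimal sketch. This is not the paper's implementation: the paper defines relevance and significance through fuzzy-rough dependency, whereas here a generic `relevance(feature_set)` callable (a hypothetical stand-in) plays that role, with significance taken as the gain in relevance when a candidate joins the selected set.

```python
# Hedged sketch of greedy relevance + significance feature selection.
# `relevance` is an assumed user-supplied set function; in the paper's
# method this would be the fuzzy-rough dependency of the class on the set.

def greedy_select(features, relevance, k):
    """Greedily pick k features. At each step, choose the candidate f
    maximizing relevance({f}) + significance(f, S), where
    significance(f, S) = relevance(S + {f}) - relevance(S)
    and S is the set selected so far."""
    selected = []
    remaining = list(features)
    for _ in range(min(k, len(remaining))):
        def score(f):
            significance = relevance(selected + [f]) - relevance(selected)
            return relevance([f]) + significance
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With an additive toy relevance measure the greedy step simply picks the highest-weight features, but the same loop penalizes redundant candidates whenever `relevance` exhibits diminishing returns on correlated feature sets, which is the behavior the min-redundancy criterion targets.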
Pages: 3968-3980
Number of pages: 13
References
48 records
[1] Aspin, A.A. (1949). Biometrika, 36, 290. DOI: 10.1093/biomet/36.3-4.290
[2] Battiti, R. (1994). Using mutual information for selecting features in supervised neural-net learning. IEEE Transactions on Neural Networks, 5(4), 537-550.
[3] Bezdek, J.C., & Pal, N.R. (1998). Some new indexes of cluster validity. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 28(3), 301-315.
[4] Bressan, M., & Vitrià, J. (2003). On the selection and classification of independent features. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), 1312-1317.
[5] Cherkassky, V. (1997). IEEE Transactions on Neural Networks, 8, 1564. DOI: 10.1109/TNN.1997.641482
[6] Chouchoulas, A., & Shen, Q. (2001). Rough set-aided keyword reduction for text categorization. Applied Artificial Intelligence, 15(9), 843-873.
[7] Das, S.K. (1971). Feature selection with a linear dependence measure. IEEE Transactions on Computers, C-20(9), 1106+.
[8] Dash, M., & Liu, H. (2003). Consistency-based search in feature selection. Artificial Intelligence, 151(1-2), 155-176.
[9] Dash, M. (2000). Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications, PAKDD'00, 110.
[10] Davies, D.L., & Bouldin, D.W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1(2), 224-227.