Benefiting feature selection by the discovery of false irrelevant attributes

Cited by: 1

Authors:

Chao, Lidia S. [1]

Wong, Derek F. [1]

Chen, Philip C. L. [1]

Ng, Wing W. Y. [2]

Yeung, Daniel S. [2]

Affiliations:

[1] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China

[2] S China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510000, Guangdong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING | 2015 / Vol. 13 / No. 04

Keywords:

Supportive relevance; hidden interaction; data preprocessing; feature selection; data mining; MUTUAL INFORMATION; MICROARRAY DATA; CLASSIFICATION; RELEVANCE;

DOI:

10.1142/S021969131550023X

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Ordinary feature selection methods select only the explicitly relevant attributes by filtering out the irrelevant ones, trading selection accuracy for lower execution time and complexity. In doing so, the hidden supportive information carried by the irrelevant attributes may be lost, so these methods may miss some good attribute combinations. We believe that attributes that are useless for the classification task by themselves may sometimes provide potentially useful supportive information to other attributes and thus benefit the classification task. Retaining such attributes can minimize the information loss and is therefore able to maximize the classification accuracy, especially for datasets that contain hidden interactions among attributes. This paper proposes a feature selection methodology from a new angle: it selects not only the relevant features but also targets the potentially useful false irrelevant attributes by measuring their supportive importance to other attributes. The empirical results support the hypothesis by demonstrating that the proposed approach outperforms most state-of-the-art filter-based feature selection methods.
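The idea of a "false irrelevant" attribute can be illustrated with a toy example. The sketch below is not the paper's method or its supportive-importance measure; it assumes empirical mutual information as a stand-in relevance score (mutual information appears among the record's keywords) and uses a hypothetical XOR dataset in which attributes `a` and `b` are each useless alone but fully determine the class together. The names `mutual_information`, `a`, `b`, `c`, and `support_of_b_for_a` are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical toy data: every combination of three binary attributes once.
rows = list(product([0, 1], repeat=3))
a = [r[0] for r in rows]
b = [r[1] for r in rows]
c = [r[2] for r in rows]                  # pure noise attribute
label = [x ^ y for x, y in zip(a, b)]     # class = a XOR b (hidden interaction)

# Individually, each attribute looks irrelevant to the class.
mi_a = mutual_information(a, label)
mi_b = mutual_information(b, label)
mi_c = mutual_information(c, label)

# Jointly, (a, b) determines the class, while (a, c) still tells nothing.
mi_ab = mutual_information(list(zip(a, b)), label)
mi_ac = mutual_information(list(zip(a, c)), label)

# Illustrative "support" of b for a: the gain from pairing a with b.
support_of_b_for_a = mi_ab - mi_a
support_of_c_for_a = mi_ac - mi_a

print(f"I(a;y)={mi_a:.3f} I(b;y)={mi_b:.3f} I(c;y)={mi_c:.3f}")
print(f"I(a,b;y)={mi_ab:.3f} I(a,c;y)={mi_ac:.3f}")
print(f"support(b->a)={support_of_b_for_a:.3f} support(c->a)={support_of_c_for_a:.3f}")
```

A filter that ranks attributes one at a time would discard `a`, `b`, and `c` alike, whereas the pairwise gain separates the supportive attribute `b` from the truly irrelevant `c`. Checking pairs costs more than univariate ranking, which is the accuracy-versus-complexity trade-off the abstract refers to.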


Pages: 17
