Simultaneous Feature Selection and Extraction Using Feature Significance

Cited by: 1
Authors:
Maji, Pradipta [1]
Garai, Partha [2]
Affiliations:
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, India
[2] Natl Inst Sci & Technol, Dept Comp Sci, Berhampur 761008, Odisha, India
Keywords:
Pattern recognition; data mining; feature selection; feature extraction; classification; mutual information
DOI:
10.3233/FI-2015-1164
CLC classification:
TP31 [Computer software]
Discipline codes:
081202; 0835
Abstract:
Dimensionality reduction of a data set by selecting or extracting relevant and nonredundant features is an essential preprocessing step in pattern recognition, data mining, machine learning, and multimedia indexing. Among the large number of features present in real-life data sets, only a small fraction is effective in representing the data set accurately. Preprocessing the data to obtain a smaller set of representative features that retains the salient characteristics of the data not only decreases processing time but also leads to more compact learned models and better generalization. In this regard, a novel dimensionality reduction method is presented here that simultaneously selects and extracts features using the concept of feature significance. The method is based on maximizing both the relevance and the significance of the reduced feature set, whereby redundancy within it is removed. The method is generic in the sense that both supervised and unsupervised feature evaluation indices can be used for simultaneous feature selection and extraction. The effectiveness of the proposed method, along with a comparison with existing feature selection and extraction methods, is demonstrated on a set of real-life data sets.
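The selection criterion the abstract describes — growing a reduced feature set by maximizing each candidate's relevance to the target while removing redundancy with the features already chosen — can be sketched as a greedy mutual-information loop. This is an illustrative mRMR-style approximation, not the authors' exact algorithm: the discrete MI estimator, the `greedy_select` helper, and the toy data below are assumptions made for the sketch.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Discrete mutual information I(X;Y) in bits, estimated from counts."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    # I(X;Y) = sum_{a,b} p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def greedy_select(features, labels, k):
    """Greedily pick k feature names maximizing relevance minus redundancy.

    features: dict name -> list of discrete values; labels: class list.
    score(f) = I(f; labels) - mean over selected s of I(f; s)
    (an mRMR-style criterion, used here only to illustrate the idea of
    keeping relevant, nonredundant features).
    """
    selected, remaining = [], set(features)
    while remaining and len(selected) < k:
        def score(f):
            rel = mutual_information(features[f], labels)
            red = (sum(mutual_information(features[f], features[s])
                       for s in selected) / len(selected)) if selected else 0.0
            return rel - red
        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.discard(best)
    return selected

# Toy example: f2 duplicates f1 (redundant), f3 is weakly relevant
# but independent of f1, so the greedy loop keeps f1 and f3.
labels = [0, 0, 0, 1, 1, 1, 1, 1]
feats = {
    "f1": [0, 0, 0, 0, 1, 1, 1, 1],
    "f2": [0, 0, 0, 0, 1, 1, 1, 1],  # exact copy of f1
    "f3": [0, 0, 1, 1, 0, 0, 1, 1],
}
print(greedy_select(feats, labels, 2))  # -> ['f1', 'f3']
```

Because f2 carries no information beyond f1, its redundancy penalty cancels its relevance, and the weaker but complementary f3 is selected instead — the behavior the abstract attributes to a relevance-plus-significance criterion.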
Pages: 405–431
Page count: 27