Feature Selection Using Neighborhood based Entropy

Cited by: 4
Authors
Farnaghi-Zadeh, Fatemeh [1]
Rahmani, Mohsen [1]
Amiri, Maryam [1]
Affiliations
[1] Arak Univ, Fac Engn, Dept Comp Engn, Arak 3815688349, Iran
Keywords
Feature Selection; Discrimination Index; Neighborhood Relations; Density; Entropy; Distinguishing Ability; EFFICIENT FEATURE-SELECTION; WORKLOAD PREDICTION; MUTUAL INFORMATION; ALGORITHM; RELEVANCE; MODEL; SET;
DOI
10.3897/jucs.79905
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
Feature selection plays an important role as a preprocessing step for pattern recognition and machine learning. Its goal is to determine an optimal subset of relevant features out of a large number of features. The neighborhood discrimination index (NDI) is one of the newest and most efficient measures of the distinguishing ability of a feature subset. NDI is computed from a neighborhood radius (E). Because E has a significant impact on NDI, selecting an appropriate value of E for each data set can be challenging and very time-consuming. This paper proposes a new approach based on targEt PointS To computE neIghborhood relatioNs (EPSTEIN). First, all the data points are sorted in descending order of their density. Then, as many of the highest-density data points as there are classes are selected as target points. To determine the neighborhood relations, circles centered on the target points are drawn, and the points inside or on the circles are considered to be neighbors. In the next step, the significance of each feature is computed and a greedy algorithm selects appropriate features. The performance of the proposed approach is compared to both the most common and the most recent feature selection methods. The experimental results show that EPSTEIN selects more efficient subsets of features and improves the prediction accuracy of classifiers in comparison to other state-of-the-art methods such as Correlation-based Feature Selection (CFS), Fast Correlation-Based Filter (FCBF), Heuristic Algorithm Based on Neighborhood Discrimination Index (HANDI), Ranking Based Feature Inclusion for Optimal Feature Subset (KNFI), Ranking Based Feature Elimination (KNFE) and Principal Component Analysis and Information Gain (PCA-IG).
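The first two steps described in the abstract (ranking points by density, then treating the points inside or on a circle around each target point as its neighbors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how density or the circle radius is defined, so the density estimate (inverse mean distance to the `k` nearest points), the `radius` argument, and the function names `select_target_points` and `neighborhoods` are all assumptions made for illustration. The feature-significance and greedy selection steps are omitted because the abstract gives no formula for them.

```python
import math


def select_target_points(points, n_classes, k=3):
    """Return indices of the n_classes densest points.

    ASSUMPTION: density is estimated as the inverse of the mean distance
    to the k nearest other points; the abstract does not define it.
    Requires at least two points.
    """
    dist = [[math.dist(p, q) for q in points] for p in points]
    density = []
    for i, row in enumerate(dist):
        # Distances to the k closest points, excluding the point itself.
        nearest = sorted(d for j, d in enumerate(row) if j != i)[:k]
        density.append(1.0 / (sum(nearest) / len(nearest) + 1e-12))
    # Sort indices in descending order of density and keep the top ones.
    order = sorted(range(len(points)), key=lambda i: density[i], reverse=True)
    return order[:n_classes]


def neighborhoods(points, targets, radius):
    """Map each target index to the indices of points inside or on its circle.

    ASSUMPTION: a single user-supplied radius is used for every target.
    """
    return {
        t: [i for i, p in enumerate(points) if math.dist(points[t], p) <= radius]
        for t in targets
    }


if __name__ == "__main__":
    # Two small, well-separated clusters (made-up illustrative data).
    points = [(0, 0), (0, 1), (1, 0), (0.4, 0.4),
              (10, 10), (10, 11), (11, 10), (10.4, 10.4)]
    targets = select_target_points(points, n_classes=2)
    print(sorted(targets))
    print(neighborhoods(points, targets, radius=1.0))
```

With this toy data the two densest points are the cluster centers, and each circle of radius 1.0 captures exactly one cluster. On real data the densest points are not guaranteed to fall in different classes, so the sketch should be read only as an illustration of the neighborhood construction.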
Pages: 1169-1192
Page count: 24
Related papers
53 in total
[1]   Text feature selection using ant colony optimization [J].
Aghdam, Mehdi Hosseinzadeh ;
Ghasem-Aghaee, Nasser ;
Basiri, Mohammad Ehsan .
EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) :6843-6853
[2]   AN INTRODUCTION TO KERNEL AND NEAREST-NEIGHBOR NONPARAMETRIC REGRESSION [J].
ALTMAN, NS .
AMERICAN STATISTICIAN, 1992, 46 (03) :175-185
[3]  
Amiri M., 2022, J. Comput. Secur., V9, P1
[4]   A new efficient approach for extracting the closed episodes for workload prediction in cloud [J].
Amiri, Maryam ;
Mohammad-Khanli, Leyli ;
Mirandola, Raffaela .
COMPUTING, 2020, 102 (01) :141-200
[5]   An online learning model based on episode mining for workload prediction in cloud [J].
Amiri, Maryam ;
Mohammad-Khanli, Leyli ;
Mirandola, Raffaela .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 87 :83-101
[6]   A sequential pattern mining model for application workload prediction in cloud environment [J].
Amiri, Maryam ;
Mohammad-Khanli, Leyli ;
Mirandola, Raffaela .
JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2018, 105 :21-62
[7]  
[Anonymous], 2000, CORRELATION BASED FE
[8]   Local Feature Selection for Data Classification [J].
Armanfard, Narges ;
Reilly, James P. ;
Komeili, Majid .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (06) :1217-1227
[9]   USING MUTUAL INFORMATION FOR SELECTING FEATURES IN SUPERVISED NEURAL-NET LEARNING [J].
BATTITI, R .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (04) :537-550
[10]  
Blake C., 1998, UCI repository of machine learning databases