Feature selection using rough set-based direct dependency calculation by avoiding the positive region

Cited by: 36
Authors
Raza, Muhammad Summair [1]
Qamar, Usman [1]
Affiliations
[1] Natl Univ Sci & Technol, Coll Elect & Mech Engn E&ME, Dept Comp Engn, Islamabad, Pakistan
Keywords
Positive region; Rough set theory; Dependency rules; Feature selection; Reducts; GENETIC ALGORITHM; REDUCTION; SYSTEMS; CLASSIFICATION; OPTIMIZATION; PERFORMANCE; DIAGNOSIS; SEARCH
DOI
10.1016/j.ijar.2017.10.012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection is the process of selecting a subset of features from the entire dataset such that the selected subset can be used on behalf of the entire dataset to reduce further processing. Many approaches have been proposed for feature selection, and recently, rough set-based feature selection approaches have become dominant. The majority of such approaches use attribute dependency as the criterion to determine the feature subsets. However, this measure uses the positive region to calculate dependency, which is computationally expensive and consequently affects the performance of feature selection algorithms that rely on it. In this paper, we propose a new heuristic-based dependency calculation method. The proposed method comprises a set of two rules, called Direct Dependency Calculation (DDC), to calculate attribute dependency. Direct dependency calculates the number of unique/non-unique classes directly by using attribute values. Unique classes define accurate predictors of class, while non-unique classes are not accurate predictors. Calculating unique/non-unique classes in this manner lets us avoid the time-consuming calculation of the positive region, which helps increase the performance of subsequent algorithms. A two-dimensional grid was used as an intermediate data structure to calculate dependency. To justify the proposed method, we used it with a number of feature selection algorithms on various publicly available datasets. A comparison framework was used for analysis purposes. Experimental results have shown the efficiency and effectiveness of the proposed method. Execution time was reduced by 63% for calculation of the dependency using DDCs, and a 65% decrease was observed in the case of feature selection algorithms based on DDCs. The required runtime memory was decreased by 95%. (C) 2017 Elsevier Inc. All rights reserved.
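The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DDC rules or their two-dimensional grid, neither of which is specified in the abstract. It only assumes the standard rough-set definition of dependency, |POS_C(D)| / |U|, and compares it with a single-pass count of records whose attribute values map to exactly one decision class. The function names, the dictionary-based bookkeeping, and the toy data are all hypothetical.

```python
from collections import defaultdict


def dependency_positive_region(rows, cond, dec):
    """Conventional dependency: |POS_C(D)| / |U| via equivalence classes."""
    # Group the decision values of all records by their conditional-attribute values.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in cond)].append(row[dec])
    # A class is in the positive region if all of its records share one decision.
    pos = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return pos / len(rows)


def dependency_direct(rows, cond, dec):
    """Hypothetical direct count: no per-class record lists are materialised."""
    # value combination -> [first decision seen, record count, still unique?]
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in cond)
        if key not in seen:
            seen[key] = [row[dec], 1, True]
        else:
            entry = seen[key]
            entry[1] += 1
            if entry[0] != row[dec]:
                entry[2] = False  # same values, different class: non-unique
    unique = sum(count for _, count, ok in seen.values() if ok)
    return unique / len(rows)


# Toy decision table (assumed data): conditional attributes "a", "b"; decision "d".
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 2, "b": 1, "d": "yes"},
    {"a": 2, "b": 1, "d": "no"},
    {"a": 3, "b": 0, "d": "no"},
]

# 4 of the 6 records fall under value combinations that map to a single class.
gamma = dependency_direct(rows, ["a", "b"], "d")
```

Both functions return the same dependency value on this table. The sketch makes no attempt to reproduce the runtime or memory reductions reported in the abstract.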
Pages: 175-197
Page count: 23