NEC: A nested equivalence class-based dependency calculation approach for fast feature selection using rough set theory

Cited by: 13
Authors
Zhao, Jie [1, 2]
Liang, Jia-Ming [1 ]
Dong, Zhen-Ning [1 ]
Tang, De-Yu [3 ]
Liu, Zhen [3 ]
Affiliations
[1] Guangdong Univ Technol, Sch Management, Guangzhou 510006, Peoples R China
[2] Cornell Univ, Sch Elect & Comp Engn, New York, NY 14850 USA
[3] Guangdong Pharmaceut Univ, Sch Med Informat & Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Rough set theory; Attribute reduction; Positive region; Heuristic algorithm; Swarm intelligence; ATTRIBUTE REDUCTION; DECISION SYSTEMS; OPTIMIZATION; ALGORITHM; CLASSIFICATION; APPROXIMATION; TABLES; PSO;
DOI
10.1016/j.ins.2020.03.092
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection plays an important role in data mining and machine learning tasks. As one of the most effective methods for feature selection, rough set theory provides a systematic theoretical framework for consistency-based feature selection, in which positive region-based dependency calculation is the most important step. However, this calculation is time-consuming, and the many improved algorithms proposed so far remain computationally expensive. To overcome this shortcoming, this study introduces a nested equivalence class (NEC) approach to calculate dependency. The proposed method starts from the finest partition of the universe, and then extracts and uses the known knowledge of reducts in a decision table to construct an NEC. In most cases, the proposed method not only simplifies dependency calculation but also reduces the universe correspondingly. Using the proposed NEC-based approach, a number of representative heuristic- and swarm intelligence-based feature selection algorithms that apply rough set theory were enhanced. The feature subset selected by each modified algorithm was the same as that selected by the original algorithm. Experiments conducted using 33 datasets from the UCI repository and KDD Cup competition, which included large-scale and high-dimensional datasets, demonstrated the efficiency and effectiveness of the proposed method. (C) 2020 Elsevier Inc. All rights reserved.
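For readers unfamiliar with the term, the positive region-based dependency mentioned in the abstract can be illustrated with the textbook rough set definition, gamma_B(D) = |POS_B(D)| / |U|. The sketch below is a naive baseline only, not the NEC method of the paper; the function name `dependency` and the toy decision table are assumptions made for illustration.

```python
from collections import defaultdict


def dependency(rows, cond_idx, dec_idx):
    """Classical rough set dependency gamma_B(D) = |POS_B(D)| / |U|.

    rows     : list of tuples, one per object in the universe U
    cond_idx : column indices of the condition attribute subset B
    dec_idx  : column index of the decision attribute D
    (Hypothetical helper; this is the naive baseline, not NEC.)
    """
    if not rows:
        return 0.0
    # Partition U into equivalence classes induced by B.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        classes[key].append(row[dec_idx])
    # An equivalence class lies in the positive region when all of its
    # objects share a single decision value (the class is consistent).
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(rows)


# Toy decision table (illustrative data): columns are
# (outlook, temperature, decision).
table = [
    ("sunny", "hot", "no"),
    ("sunny", "hot", "yes"),
    ("rainy", "hot", "yes"),
    ("rainy", "cool", "yes"),
]

gamma_outlook = dependency(table, [0], 2)
gamma_temperature = dependency(table, [1], 2)
gamma_both = dependency(table, [0, 1], 2)
```

Each call repartitions the whole universe from scratch, which is the repeated cost that faster dependency calculation methods aim to reduce.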
Pages: 431-453
Number of pages: 23
Related papers
50 in total
  • [41] Consistency approximation: Incremental feature selection based on fuzzy rough set theory
    Zhao, Jie
    Wu, Daiyang
    Wu, Jiaxin
    Ye, Wenhao
    Huang, Faliang
    Wang, Jiahai
    See-To, Eric W. K.
    PATTERN RECOGNITION, 2024, 155
  • [42] An Exact Feature Selection Algorithm Based on Rough Set Theory
    Rezvan, Mohammad Taghi
    Hamadani, Ali Zeinal
    Hejazi, Seyed Reza
    COMPLEXITY, 2015, 20 (05) : 50 - 62
  • [43] Fault feature subset selection based on rough set theory
    Zhao, Yueling
    Xu, Lin
    Wang, Jianhui
    Gu, Shusheng
    Complexity Analysis and Control for Social, Economical and Biological Systems, 2006, 1 : 162 - 171
  • [44] A model based on ant colony system and rough set theory to feature selection
    Bello, R.
    Nowe, A.
    Caballero, Y.
    Gomez, Y.
    Vrancx, P.
    GECCO 2005: GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, VOLS 1 AND 2, 2005, : 275 - 276
  • [45] Third Order Backward Elimination Approach for Fuzzy-Rough Set Based Feature Selection
    Ghosh, Soumen
    Prasad, P. S. V. S. Sai
    Rao, C. Raghavendra
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2017, 2017, 10597 : 254 - 262
  • [46] Rough Set Based Feature Selection Approach for Text Mining
    Sailaja, N. Venkata
    Sree, L. Padma
    Mangathayaru, N.
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I), 2016, : 40 - 45
  • [47] Feature Selection Method for Network Intrusion Based on Fast Attribute Reduction of Rough Set
    Geng, Guohua
    Li, Na
    Gong, Shangfu
    2012 INTERNATIONAL CONFERENCE ON INDUSTRIAL CONTROL AND ELECTRONICS ENGINEERING (ICICEE), 2012, : 530 - 534
  • [48] A hybrid approach using rough set theory and hypergraph for feature selection on high-dimensional medical datasets
    Raman, M. R. Gauthama
    Nivethitha, Somu
    Kannan, Krithivasan
    Sriram, V. S. Shankar
    SOFT COMPUTING, 2019, 23 (23) : 12655 - 12672
  • [49] Fast calculation for approximations in Dominance-based Rough Set Approach using Dual Information Granule
    Zhao, Jie
    Wu, Daiyang
    Wu, Jiaxin
    See-To, Eric W. K.
    Huang, Faliang
    APPLIED SOFT COMPUTING, 2023, 149
  • [50] Feature Selection Using Rough Set Theory from Infected Rice Plant Images
    Chatterjee, Ahan
    Roy, Swagatam
    Das, Sunanda
    COMPUTATIONAL INTELLIGENCE IN PATTERN RECOGNITION, CIPR 2020, 2020, 1120 : 417 - 427