Fast attribute reduction via inconsistent equivalence classes for large-scale data

Cited by: 4
Authors
Wang, Guoqiang [1 ,2 ,3 ]
Zhang, Pengfei [4 ]
Wang, Dexian
Chen, Hongmei [1 ,2 ,3 ]
Li, Tianrui [1 ,2 ,3 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 611756, Peoples R China
[2] Southwest Jiaotong Univ, Inst Artificial Intelligence, Chengdu 611756, Peoples R China
[3] Southwest Jiaotong Univ, Natl Engn Lab Integrated Transportat Big Data Appl, Chengdu 611756, Peoples R China
[4] Chengdu Univ Tradit Chinese Med, Sch Intelligent Med, Chengdu 611137, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Rough set; Attribute reduction; Granular computing; Hash table; Big data; DEPENDENCY CALCULATION TECHNIQUE; ROUGH SET; FEATURE-SELECTION; INDISCERNIBILITY; UNCERTAINTY; ACCELERATOR; ALGORITHMS; ENTROPY;
DOI
10.1016/j.ijar.2023.109039
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Feature selection, also known as attribute reduction, plays a crucial role in machine learning and data mining tasks. Rough set theory-based feature selection methods have gained popularity because they handle imprecise and inconsistent data, are easy to implement, and produce highly interpretable results. However, these methods still incur high computational costs on large-scale, high-dimensional datasets. To overcome this shortcoming, we propose a fast attribute reduction method based on inconsistent equivalence classes. The presented method can accelerate any attribute reduction algorithm whose importance measure can be computed using only inconsistent equivalence classes. Our proposed method improves attribute reduction efficiency in three key ways: 1) transforming the original dataset into an equivalent, simplified version with fewer samples, 2) accelerating the computation of core attributes, and 3) expediting the forward selection process by removing redundant objects and attributes. Experimental results demonstrate the high computational efficiency of the proposed method.
Pages: 22
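
The abstract's central device is grouping objects into equivalence classes with a hash table and isolating the inconsistent ones, i.e., classes whose members agree on all condition attributes but carry different decision labels. The Python sketch below is an illustration written for this record, not the authors' implementation; the function names (equivalence_classes, split_by_consistency, simplified_dataset) and the toy decision table are hypothetical.

```python
# Minimal sketch of inconsistent-equivalence-class detection via hashing.
# Not the paper's code: a hedged illustration of the idea in the abstract.
from collections import defaultdict

def equivalence_classes(data, attrs):
    """Group object indices by their value tuple on `attrs` (one hash pass)."""
    classes = defaultdict(list)
    for i, row in enumerate(data):
        classes[tuple(row[a] for a in attrs)].append(i)
    return classes

def split_by_consistency(data, attrs, labels):
    """Split classes into consistent (one decision label) and inconsistent."""
    consistent, inconsistent = {}, {}
    for key, idxs in equivalence_classes(data, attrs).items():
        target = inconsistent if len({labels[i] for i in idxs}) > 1 else consistent
        target[key] = idxs
    return consistent, inconsistent

def simplified_dataset(data, attrs, labels):
    """Keep one representative per equivalence class (fewer samples).
    The actual method treats inconsistent classes specially; this sketch
    just shows how the sample count shrinks."""
    reps = [idxs[0] for idxs in equivalence_classes(data, attrs).values()]
    return [data[i] for i in reps], [labels[i] for i in reps]

if __name__ == "__main__":
    # Toy decision table: rows are objects, columns are condition attributes.
    data = [(0, 1), (0, 1), (1, 0), (1, 0), (1, 1)]
    labels = ["yes", "no", "yes", "yes", "no"]  # class (0, 1) is inconsistent
    cons, incons = split_by_consistency(data, [0, 1], labels)
    print("inconsistent classes:", incons)      # {(0, 1): [0, 1]}
    X, y = simplified_dataset(data, [0, 1], labels)
    print("simplified:", len(X), "objects instead of", len(data))
```

Each pass here is a single hash-table scan, linear in the number of objects, which is consistent with the record's "Hash table" keyword and with the acceleration claimed for steps 1) through 3); the paper's method additionally prunes redundant objects and attributes during forward selection.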