Rapid and optimized parallel attribute reduction based on neighborhood rough sets and MapReduce

Cited by: 4
Authors
Hanuman, V. K. [1 ]
Chebrolu, Srilatha [1 ]
Affiliations
[1] Natl Inst Technol Andhra Pradesh, Dept Comp Sci & Engn, Tadepalligudem 534101, Andhra Pradesh, India
Keywords
Attribute reduction; Neighborhood rough sets; MapReduce; Neighborhood information; Data preprocessing; Computational complexity; High-dimensional data; ALGORITHM; EFFICIENT
DOI
10.1016/j.eswa.2024.125323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Attribute reduction is a crucial step in data preprocessing and feature engineering: it selects a subset of relevant attributes to reduce the computational complexity of machine learning models and improve their performance. Neighborhood rough set (NRS) theory provides a valuable framework for attribute reduction, using neighborhood information to identify non-redundant, informative attributes for data analysis and machine learning tasks. Attribute subsets obtained with NRS theory are of high quality and yield effective prediction accuracy in Euclidean space. However, existing NRS-based solutions are resource-intensive because of the large search space involved in finding neighborhoods and because of redundant computations. To overcome these limitations, we propose the rapid and optimized attribute reduction (ROAR) algorithm, which optimizes the current state-of-the-art attribute-reduction method in NRS theory. The strength of ROAR lies in its ability to accelerate computation by rapidly determining the neighborhood consistency of data samples, which in turn expedites the identification of both the positive and boundary regions. This efficiency significantly reduces the overall processing time of data analysis tasks. Experimental results on 12 standard datasets demonstrate that ROAR is highly efficient, obtaining accurate reduction results with rapid response times. To make ROAR suitable for high-dimensional datasets, we provide a parallel implementation, the P-ROAR algorithm. P-ROAR is the first parallel attribute-reduction algorithm in the classical NRS theory. Computational speed and scalability metrics establish that P-ROAR is much faster and more scalable on datasets with an enormous attribute space. These algorithms provide a tool for handling feature reduction in data engineering without compromising accuracy or performance.
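The abstract refers to neighborhood consistency and to positive and boundary regions without defining them. The sketch below illustrates the textbook NRS definitions only; it is not the ROAR or P-ROAR algorithm, whose optimizations are not described in this record. It assumes Euclidean distance and a fixed radius `delta`, and the function names and toy data are hypothetical.

```python
from math import dist

def neighborhood_regions(samples, labels, attrs, delta):
    """Split sample indices into (positive, boundary) regions.

    A sample is neighborhood-consistent, and so belongs to the positive
    region, when every sample within distance `delta` of it (measured on
    the attribute subset `attrs`) carries the same decision label.
    This is the brute-force O(n^2) definition, not an optimized method.
    """
    positive, boundary = [], []
    for i, x in enumerate(samples):
        xi = [x[a] for a in attrs]
        consistent = True
        for j, y in enumerate(samples):
            yj = [y[a] for a in attrs]
            if dist(xi, yj) <= delta and labels[j] != labels[i]:
                consistent = False
                break
        (positive if consistent else boundary).append(i)
    return positive, boundary

def dependency(samples, labels, attrs, delta):
    """Fraction of samples in the positive region for `attrs`."""
    positive, _ = neighborhood_regions(samples, labels, attrs, delta)
    return len(positive) / len(samples)

# Hypothetical toy data: attribute 0 separates the classes, attribute 1 does not.
samples = [(0.0, 0.0), (0.1, 0.9), (0.9, 0.1), (1.0, 1.0)]
labels = [0, 0, 1, 1]

for attrs in ([0], [1], [0, 1]):
    print(attrs, dependency(samples, labels, attrs, delta=0.3))
```

On this toy data, attribute 0 alone already gives the same dependency as the full attribute set, so attribute 1 is redundant under this measure. An attribute-reduction method searches for such subsets, and the nested loop over all sample pairs is the kind of cost the abstract describes as resource-intensive.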
Pages: 18
Related papers
50 in total
  • [1] Improving on a Rapid Attribute Reduction Algorithm Based on Neighborhood Rough Sets
    Guo, Gongzhen
    Liu, Zunren
    Lou, Chang
    Song, Xiaoxiao
    2015 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2015 : 236 - 240
  • [2] Attribute reduction based on neighborhood constrained fuzzy rough sets
    Hu, Meng
    Guo, Yanting
    Chen, Degang
    Tsang, Eric C. C.
    Zhang, Qingshuo
    KNOWLEDGE-BASED SYSTEMS, 2023, 274
  • [3] Tri-level attribute reduction based on neighborhood rough sets
    Lianhui Luo
    Jilin Yang
    Xianyong Zhang
    Junfang Luo
    Applied Intelligence, 2024, 54 : 3786 - 3807
  • [4] Dominance-Based Neighborhood Rough Sets and Its Attribute Reduction
    Chen, Hongmei
    Li, Tianrui
    Luo, Chuan
    Hu, Jie
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, RSKT 2015, 2015, 9436 : 89 - 99
  • [5] Tri-level attribute reduction based on neighborhood rough sets
    Luo, Lianhui
    Yang, Jilin
    Zhang, Xianyong
    Luo, Junfang
    APPLIED INTELLIGENCE, 2024, 54 (05) : 3786 - 3807
  • [6] Attribute reduction based on k-nearest neighborhood rough sets
    Wang, Changzhong
    Shi, Yunpeng
    Fan, Xiaodong
    Shao, Mingwen
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2019, 106 : 18 - 31
  • [7] A novel approach to attribute reduction based on weighted neighborhood rough sets
    Hu, Meng
    Tsang, Eric C. C.
    Guo, Yanting
    Chen, Degang
    Xu, Weihua
    KNOWLEDGE-BASED SYSTEMS, 2021, 220
  • [8] Variable radius neighborhood rough sets and attribute reduction
    Zhang, Di
    Zhu, Ping
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2022, 150 : 98 - 121
  • [9] Parallel Attribute Reduction Based on MapReduce
    Xi, Dachao
    Wang, Guoyin
    Zhang, Xuerui
    Zhang, Fan
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, RSKT 2014, 2014, 8818 : 631 - 641
  • [10] Parallel attribute reduction in dominance-based neighborhood rough set
    Chen, Hongmei
    Li, Tianrui
    Cai, Yong
    Luo, Chuan
    Fujita, Hamido
    INFORMATION SCIENCES, 2016, 373 : 351 - 368