Stateful MapReduce Framework for mRMR Feature Selection Using Horizontal Partitioning

Cited by: 0
Authors
Yelleti, Vivek [1]
Prasad, P. S. V. S. Sai [1]
Affiliations
[1] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad 500046, Telangana, India
Source
PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2021 | 2024 / Vol. 13102
Keywords
Feature Selection; mRMR; Big data; Horizontal partitioning; MapReduce; Iterative MapReduce;
DOI
10.1007/978-3-031-12700-7_33
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection (FS) is an important pre-processing step in building machine learning models. The minimum Redundancy Maximum Relevance (mRMR) approach has emerged as one of the successful algorithms for obtaining a non-redundant feature subset using only bivariate computations. In the current digital age, the prevalence of very large-scale datasets has created a pressing need for scalable solutions based on distributed/parallel algorithms. MapReduce has proven to be one of the best approaches for designing fault-tolerant and scalable solutions. This work analyses the existing horizontal MapReduce approaches for mRMR feature selection and identifies their limitations. It is observed that existing approaches involve redundant and repetitive computations and lack a metadata framework to reduce them. This motivated us to propose a horizontal-partitioning-based MapReduce solution, HMR_mRMR, which is an iterative MapReduce algorithm designed for Apache Spark. Appropriate use of a metadata framework and solution formulation optimizes the computations in the proposed approach. A comparative experimental study against existing approaches is conducted to establish the importance of HMR_mRMR.
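For readers unfamiliar with mRMR, the following is a minimal single-machine sketch of the standard greedy selection in its difference form (relevance minus mean redundancy), using only bivariate mutual information as the abstract describes. It is not the HMR_mRMR algorithm and uses neither MapReduce nor Spark. The function names, the toy data, and the pairwise cache are assumptions made for illustration; the cache only gestures at the general idea of not recomputing values across iterations and is not the paper's metadata framework.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr(columns, labels, k):
    """Greedy mRMR, difference form (illustrative sketch, not HMR_mRMR)."""
    # Relevance: MI between each feature and the class labels.
    relevance = {f: mutual_information(col, labels) for f, col in columns.items()}
    pair_cache = {}  # hypothetical cache: each feature pair is computed once
    selected = []
    candidates = set(columns)

    def score(f):
        if not selected:
            return relevance[f]
        total = 0.0
        for s in selected:
            key = (f, s) if f < s else (s, f)
            if key not in pair_cache:
                pair_cache[key] = mutual_information(columns[f], columns[s])
            total += pair_cache[key]
        # Relevance minus mean redundancy with the already selected features.
        return relevance[f] - total / len(selected)

    while candidates and len(selected) < k:
        # Sorting makes tie-breaking deterministic (first name wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (made up for illustration): "b" duplicates "a".
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [0, 0, 0, 0, 1, 1, 1, 0],
    "b": [0, 0, 0, 0, 1, 1, 1, 0],
    "c": [0, 0, 1, 0, 1, 1, 0, 1],
}
print(mrmr(columns, labels, 2))
```

On this toy data, the duplicate feature "b" is as relevant as "a" but is penalised by its redundancy with "a" once "a" is selected, so the less relevant but less redundant "c" is picked second.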
Pages: 317-327
Page count: 11
Related papers
50 in total
  • [31] A fast and novel approach based on grouping and weighted mRMR for feature selection and classification of protein sequence data
    Kaur, Kiranpreet
    Patil, Nagamma
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2020, 23 (01) : 47 - 61
  • [32] Comparison of SFS and mRMR for oximetry feature selection in obstructive sleep apnea detection
    Sheikh Shanawaz Mostafa
    Fernando Morgado-Dias
    Antonio G. Ravelo-García
    Neural Computing and Applications, 2020, 32 : 15711 - 15731
  • [33] Design of metaheuristic rough set-based feature selection and rule-based medical data classification model on MapReduce framework
    Bhukya, Hanumanthu
    Manchala, Sadanandam
    JOURNAL OF INTELLIGENT SYSTEMS, 2022, 31 (01) : 1002 - 1013
  • [34] Feature Selection Using a Neural Framework With Controlled Redundancy
    Chakraborty, Rudrasis
    Pal, Nikhil R.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (01) : 35 - 50
  • [35] An Intelligent Metaheuristic Binary Pigeon Optimization-Based Feature Selection and Big Data Classification in a MapReduce Environment
    Abukhodair, Felwa
    Alsaggaf, Wafaa
    Jamal, Amani Tariq
    Abdel-Khalek, Sayed
    Mansour, Romany F.
    MATHEMATICS, 2021, 9 (20)
  • [36] Comparison of SFS and mRMR for oximetry feature selection in obstructive sleep apnea detection
    Mostafa, Sheikh Shanawaz
    Morgado-Dias, Fernando
    Ravelo-Garcia, Antonio G.
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (20) : 15711 - 15731
  • [37] A Balanced Partitioning Mechanism Using Collapsed-Condensed Trie in MapReduce
    Chen, Hsing-Lung
    Chen, Syu-Huan
    2018 IEEE 8TH INTERNATIONAL SYMPOSIUM ON CLOUD AND SERVICE COMPUTING (SC2), 2018, : 97 - 102
  • [38] DOA Estimation by Feature Extraction Based on Parallel Deep Neural Networks and MRMR Feature Selection Algorithm
    Al-Tameemi, Ashwaq Neaman Hassan
    Feghhi, Mahmood Mohassel
    Tazehkand, Behzad Mozaffari
    IEEE ACCESS, 2025, 13 : 40480 - 40502
  • [39] Big Data Analysis Solutions using MapReduce Framework
    Elagib, Sara B.
    Najeeb, Atahur Rahman
    Hashim, Aisha H.
    Olanrewaju, Rashidah F.
    2014 INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION ENGINEERING (ICCCE), 2014, : 127 - 130
  • [40] Feature based Composite Approach for Sarcasm Detection using MapReduce
    Parmar, Krishna
    Limbasiya, Nivid
    Dhamecha, Maulik
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC 2018), 2018, : 587 - 591