Stateful MapReduce Framework for mRMR Feature Selection Using Horizontal Partitioning

Cited by: 0
Authors
Yelleti, Vivek [1 ]
Prasad, P. S. V. S. Sai [1 ]
Affiliations
[1] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad 500046, Telangana, India
Source
PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2021 | 2024 / Vol. 13102
Keywords
Feature Selection; mRMR; Big data; Horizontal partitioning; MapReduce; Iterative MapReduce;
DOI
10.1007/978-3-031-12700-7_33
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection (FS) is an important pre-processing step in building machine learning models. The minimum Redundancy and Maximum Relevance (mRMR) approach has emerged as one of the successful algorithms for obtaining a non-redundant feature subset using only bivariate computations. In the current digital age, owing to the prevalence of very large-scale datasets, a pressing need has arisen for scalable solutions based on distributed/parallel algorithms. MapReduce has proven to be one of the best approaches for designing fault-tolerant and scalable solutions. This work analyses the existing horizontal MapReduce approaches for mRMR feature selection and identifies their limitations. It is observed that existing approaches involve redundant and repetitive computations and lack a metadata framework to reduce them. This motivated us to propose a horizontal-partitioning-based MapReduce solution, namely HMR_mRMR, which is an iterative MapReduce algorithm designed under Apache Spark. Appropriate use of a metadata framework and solution formulation optimizes the computations in the proposed approach. A comparative experimental study is conducted against existing approaches to establish the importance of HMR_mRMR.
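The abstract does not give the details of HMR_mRMR, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes discrete features, the difference form of the mRMR criterion (relevance minus mean redundancy), and plain Python lists standing in for Spark partitions. All function names (`map_counts`, `mutual_information`, `mrmr`) and the `redundancy_sum` cache are hypothetical, introduced purely for illustration of how cached state can avoid recomputing pairwise terms across iterations.

```python
from collections import Counter
from functools import reduce
from math import log


def map_counts(partition, i, j):
    """Map step: joint value counts of columns i and j within one row partition."""
    return Counter((row[i], row[j]) for row in partition)


def mutual_information(partitions, i, j):
    """Reduce step: sum per-partition counts, then compute I(X_i; X_j) in nats."""
    joint = reduce(lambda a, b: a + b,
                   (map_counts(p, i, j) for p in partitions),
                   Counter())
    n = sum(joint.values())
    px, py = Counter(), Counter()
    for (x, y), c in joint.items():
        px[x] += c
        py[y] += c
    return sum(c / n * log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def mrmr(partitions, n_features, label_col, k):
    """Greedy mRMR (difference form) with cached redundancy sums.

    Assumption: the running sums play the role of reusable state, so each
    iteration only computes MI against the most recently selected feature.
    """
    relevance = {f: mutual_information(partitions, f, label_col)
                 for f in range(n_features)}
    redundancy_sum = {f: 0.0 for f in range(n_features)}
    selected, candidates = [], set(range(n_features))

    def score(f):
        if not selected:
            return relevance[f]
        return relevance[f] - redundancy_sum[f] / len(selected)

    while candidates and len(selected) < k:
        if selected:
            last = selected[-1]
            for f in candidates:
                redundancy_sum[f] += mutual_information(partitions, f, last)
        best = max(sorted(candidates), key=score)  # ties go to the lowest index
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: columns 0..2 are features, column 3 is the class label.
# Feature 1 duplicates feature 0; feature 2 is independent of feature 0.
rows = [(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 1, 0), (0, 0, 1, 0),
        (1, 1, 0, 1), (1, 1, 0, 0), (1, 1, 1, 1), (1, 1, 1, 1)]
partitions = [rows[:4], rows[4:]]  # horizontal (row-wise) split
print(mrmr(partitions, n_features=3, label_col=3, k=2))  # [0, 2]
```

In this toy run the duplicate feature 1 is skipped in favor of the weaker but non-redundant feature 2, and because only counts are summed across partitions, the result does not depend on how the rows are split.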
Pages: 317 - 327
Number of pages: 11