Imbalanced data classification using MapReduce and relief

Cited by: 6
Authors
Jedrzejowicz, Joanna [1]
Kostrzewski, Robert [1]
Neumann, Jakub [1]
Zakrzewska, Magdalena [1]
Affiliation
[1] Univ Gdansk, Fac Math Phys & Informat, Inst Informat, PL-80308 Gdansk, Poland
Keywords
Imbalanced data; classification; parallelization; feature selection
DOI
10.1080/24751839.2018.1440454
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Classification of imbalanced data has been reported to require modification of standard classification algorithms, and it has lately attracted considerable attention because of practical applications in industry, banking and finance. The aim of the paper is to examine algorithms known from the literature when two modifications are introduced: MapReduce to parallelize computations and Relief to select the most valuable attributes. Both modifications are needed in the Big Data area. Two new algorithms are also considered.
Pages: 217-230
Number of pages: 14
Related papers
50 items in total
  • [1] On the use of MapReduce for imbalanced big data using Random Forest
    del Rio, Sara
    Lopez, Victoria
    Manuel Benitez, Jose
    Herrera, Francisco
    INFORMATION SCIENCES, 2014, 285 : 112 - 137
  • [2] The classification of imbalanced large data sets based on MapReduce and ensemble of ELM classifiers
    Junhai Zhai
    Sufang Zhang
    Chenxi Wang
    International Journal of Machine Learning and Cybernetics, 2017, 8 : 1009 - 1017
  • [3] The classification of imbalanced large data sets based on MapReduce and ensemble of ELM classifiers
    Zhai, Junhai
    Zhang, Sufang
    Wang, Chenxi
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2017, 8 (03) : 1009 - 1017
  • [4] Ensemble feature selection approach for imbalanced textual data using MapReduce
    Amazal H.
    Ramdani M.
    Kissi M.
    International Journal of Business Intelligence and Data Mining, 2021, 19 (04) : 395 - 417
  • [5] Classification of Microarray Data Using SVM Mapreduce
    Jenifer, X. R.
    Lawrance, R.
    2017 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT TECHNIQUES IN CONTROL, OPTIMIZATION AND SIGNAL PROCESSING (INCOS), 2017,
  • [6] Improved KD-tree based imbalanced big data classification and oversampling for MapReduce platforms
    Sleeman, William C.
    Roseberry, Martha
    Ghosh, Preetam
    Cano, Alberto
    Krawczyk, Bartosz
    APPLIED INTELLIGENCE, 2024, 54 (23) : 12558 - 12575
  • [7] Binary classification for imbalanced data using data conformity mechanism
    Zheng, Jian
    Ren, Shumiao
    Zhang, Jingyue
    Wang, Shiyan
    Li, Lin
    MULTIMEDIA SYSTEMS, 2025, 31 (01)
  • [8] Imbalanced Data Stream Classification Using Hybrid Data Preprocessing
    Bobowska, Barbara
    Klikowski, Jakub
    Wozniak, Michal
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 1168 : 402 - 413
  • [9] Cost-sensitive incremental Classification under the MapReduce framework for Mining Imbalanced Massive Data Streams
    Huang Yuwen
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2015, 18 (1-2) : 177 - 194
  • [10] Imbalanced Data Classification Using Reduced Multivariate Polynomial
    Woo, Seongyoun
    Lee, Chulhee
    REMOTELY SENSED DATA COMPRESSION, COMMUNICATIONS, AND PROCESSING XII, 2016, 9874