Imbalanced data classification using MapReduce and Relief

Cited by: 6
Authors
Jedrzejowicz, Joanna [1]
Kostrzewski, Robert [1]
Neumann, Jakub [1]
Zakrzewska, Magdalena [1]
Affiliations
[1] Univ Gdansk, Fac Math Phys & Informat, Inst Informat, PL-80308 Gdansk, Poland
Keywords
Imbalanced data; classification; parallelization; feature selection
DOI
10.1080/24751839.2018.1440454
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Classification of imbalanced data has been reported to require modification of standard classification algorithms, and it has lately attracted considerable attention owing to practical applications in industry, banking and finance. This paper examines algorithms known from the literature after two modifications are introduced: MapReduce, to parallelize computations, and Relief, to select the most valuable attributes. Both modifications are needed in Big Data settings. Two new algorithms are also considered.
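As a rough illustration of how the two modifications named in the abstract can fit together, the sketch below computes Relief-style attribute weights in a map/reduce pattern. It is not the authors' implementation: the function names (`relief_map`, `relief_reduce`, `relief_weights`), the Manhattan distance, the assumption that features are scaled to [0, 1], and the choice to search for the nearest hit and miss only inside each partition are all assumptions made for this example.

```python
from functools import reduce


def manhattan(a, b):
    """Manhattan distance between two equal-length feature tuples."""
    return sum(abs(p - q) for p, q in zip(a, b))


def relief_map(chunk):
    """Map step: partial Relief weight sums for one data partition.

    chunk is a list of (features, label) pairs with features in [0, 1].
    Nearest hit and miss are searched inside the partition only
    (an assumption that keeps the mappers independent).
    """
    n_features = len(chunk[0][0])
    sums = [0.0] * n_features
    used = 0
    for i, (x, y) in enumerate(chunk):
        hits = [f for j, (f, lab) in enumerate(chunk) if lab == y and j != i]
        misses = [f for f, lab in chunk if lab != y]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour in this chunk
        hit = min(hits, key=lambda f: manhattan(x, f))
        miss = min(misses, key=lambda f: manhattan(x, f))
        for k in range(n_features):
            # Reward attributes that differ on the nearest miss,
            # penalise attributes that differ on the nearest hit.
            sums[k] += abs(x[k] - miss[k]) - abs(x[k] - hit[k])
        used += 1
    return sums, used


def relief_reduce(a, b):
    """Reduce step: add partial sums and instance counts from two mappers."""
    return [p + q for p, q in zip(a[0], b[0])], a[1] + b[1]


def relief_weights(chunks):
    """Average the summed updates over all instances that contributed."""
    sums, used = reduce(relief_reduce, map(relief_map, chunks))
    return [s / used for s in sums] if used else sums
```

Calling `relief_weights` on a list of partitions returns one weight per attribute; attributes with higher weights separate the classes better and would be the ones kept for classification. In a real MapReduce deployment the built-in `map` and `reduce` calls would be replaced by the framework's distributed equivalents.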
Pages: 217-230
Page count: 14