Imbalance: Oversampling algorithms for imbalanced classification in R

Cited by: 59
Authors
Cordon, Ignacio [1]
Garcia, Salvador [1]
Fernandez, Alberto [1]
Herrera, Francisco [1]
Affiliations
[1] Univ Granada, DaSCI Andalusian Inst Data Sci & Computat Intelli, Granada, Spain
Keywords
Oversampling; Imbalanced classification; Machine learning; Preprocessing; SMOTE; Software
DOI
10.1016/j.knosys.2018.07.035
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Handling imbalanced datasets in classification tasks is an active research topic, mainly because standard classification algorithms may identify minority-class instances with a reduced success rate. Among the solutions proposed for this problem, data-level techniques have shown robust behavior. This paper introduces the novel imbalance package. Written in R and C++ and available from the CRAN repository, the library includes recent, relevant oversampling algorithms that improve the quality of imbalanced datasets before a learning task is performed. The main features of the package and some illustrative examples of its use are detailed throughout the manuscript. (C) 2018 Elsevier B.V. All rights reserved.
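As an illustration of the data-level idea the abstract refers to, the following is a minimal sketch of SMOTE-style oversampling: each synthetic minority point is placed on the segment between a minority sample and one of its nearest minority neighbours. It is written in plain Python rather than R and does not use or reproduce the imbalance package's API; the function name `smote_like_oversample`, its parameters, and the toy data are assumptions made for this example only.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Hypothetical helper for illustration; not the imbalance package's API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed): 4 minority samples against 10 majority samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(5.0, 5.0)] * 10

new_points = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class. The sketch only balances class counts; it makes no claim about which algorithms the package implements or how.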
Pages: 329-341
Page count: 13