Imbalance: Oversampling algorithms for imbalanced classification in R

Cited by: 59
Authors
Cordon, Ignacio [1]
Garcia, Salvador [1]
Fernandez, Alberto [1]
Herrera, Francisco [1]
Affiliations
[1] Univ Granada, DaSCI Andalusian Inst Data Sci & Computat Intelli, Granada, Spain
Keywords
Oversampling; Imbalanced classification; Machine learning; Preprocessing; SMOTE; Software
DOI
10.1016/j.knosys.2018.07.035
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Addressing imbalanced datasets in classification tasks is a relevant topic in research studies. The main reason is that, for standard classification algorithms, the success rate when identifying minority class instances may be adversely affected. Among the different solutions to cope with this problem, data-level techniques have shown robust behavior. In this paper, the novel imbalance package is introduced. Written in R and C++, and available from the CRAN repository, this library includes recent relevant oversampling algorithms to improve the quality of data in imbalanced datasets prior to performing a learning task. The main features of the package, as well as some illustrative examples of its use, are detailed throughout this manuscript. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 329-341
Number of pages: 13
Related papers
50 in total
  • [21] Distributional Random Oversampling for Imbalanced Text Classification
    Moreo, Alejandro
    Esuli, Andrea
    Sebastiani, Fabrizio
    SIGIR'16: PROCEEDINGS OF THE 39TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2016, : 805 - 808
  • [22] Oversampling Methods for Classification of Imbalanced Breast Cancer Malignancy Data
    Krawczyk, Bartosz
    Jelen, Lukasz
    Krzyzak, Adam
    Fevens, Thomas
    COMPUTER VISION AND GRAPHICS, 2012, 7594 : 483 - 490
  • [23] An Adaptive Oversampling Technique for Imbalanced Datasets
    Shahee, Shaukat Ali
    Ananthakumar, Usha
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 1 - 16
  • [24] SMOTE-BD: An Exact and Scalable Oversampling Method for Imbalanced Classification in Big Data
    Basgall, Maria Jose
    Hasperue, Waldo
    Naiouf, Marcelo
    Fernandez, Alberto
    Herrera, Francisco
    JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY, 2018, 18 (03): : 203 - 209
  • [25] Minority oversampling for imbalanced ordinal regression
    Zhu, Tuanfei
    Lin, Yaping
    Liu, Yonghe
    Zhang, Wei
    Zhang, Jianming
    KNOWLEDGE-BASED SYSTEMS, 2019, 166 : 140 - 155
  • [26] VCOS: A Novel Synergistic Oversampling Algorithm in Binary Imbalance Classification
    Zhang, Chunkai
    Zhou, Ting
    Deng, Yepeng
    IEEE ACCESS, 2019, 7 : 145435 - 145443
  • [27] Novel Oversampling Algorithm for Handling Imbalanced Data Classification
    More, Anjali S.
    Rana, Dipti P.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (08) : 491 - 496
  • [28] A NOVEL RULE-BASED OVERSAMPLING APPROACH FOR IMBALANCED DATA CLASSIFICATION
    Zhang, Xiao
    Paz, Ivan
    Nebot, Angela
    37TH ANNUAL EUROPEAN SIMULATION AND MODELLING CONFERENCE 2023, ESM 2023, 2023, : 208 - 212
  • [29] A novel oversampling and feature selection hybrid algorithm for imbalanced data classification
    Feng, Fang
    Li, Kuan-Ching
    Yang, Erfu
    Zhou, Qingguo
    Han, Lihong
    Hussain, Amir
    Cai, Mingjiang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (03) : 3231 - 3267
  • [30] Classification of Imbalanced Data by Oversampling in Kernel Space of Support Vector Machines
    Mathew, Josey
    Pang, Chee Khiang
    Luo, Ming
    Leong, Weng Hoe
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (09) : 4065 - 4076