Evaluating the Performance of Data Level Methods Using KEEL Tool to Address Class Imbalance Problem

Authors
Kamlesh Upadhyay
Prabhjot Kaur
Deepak Kumar Verma
Affiliations
[1] Lingayas Vidyapeeth, Department of Information Technology
[2] Maharaja Surajmal Institute of Technology
[3] Lingayas Vidyapeeth
Source
Arabian Journal for Science and Engineering | 2022 / Vol. 47
Keywords
Algorithm level approaches; Binary classification; Class imbalance problem; Data level approaches; Ensembled approach
Abstract
The class imbalance problem (CIP) has become a prominent topic in machine learning in recent years. As the application areas of technology grow, the size and variety of data grow as well. Much real-world raw data is naturally imbalanced, for example in credit card fraud, fraudulent telephone calls, shuttle system failures, text classification, nuclear explosions, oil spill detection, and brain tumor image detection. Classification algorithms cannot classify imbalanced data accurately, and their results skew toward the larger class. This is known as the class imbalance problem. This paper assesses various data-level methods that are used to balance the data before classification. It also discusses the characteristics of data that affect the class imbalance problem and the reasons why traditional classification algorithms cannot handle it. In addition, it discusses other data abnormalities that make the CIP more critical, such as the size of the data, overlapping classes, noise in the data, and the data distribution within each class. The paper empirically compares 20 data-level methods on 44 real imbalanced UCI data sets, with imbalance ratios ranging from 1.82 to 129.44, using the KEEL tool. The performance of the methods is assessed using the AUC, F-measure, and G-mean metrics, and the results are analyzed and presented graphically.
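The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code and does not use KEEL. The function names are hypothetical, random oversampling stands in for a generic data-level method (the abstract does not list the 20 methods evaluated), and the imbalance ratio is assumed to be the majority-class count divided by the minority-class count.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Illustrative stand-in for a data-level method; binary labels are assumed.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [r for r, y in zip(rows, labels) if y == minority]
    extra = [rng.choice(pool) for _ in range(deficit)]
    return rows + extra, labels + [minority] * deficit


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, F-measure and G-mean from hard binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "f_measure": f_measure,
        "g_mean": math.sqrt(recall * specificity),
    }
```

On a toy data set with nine majority examples and one minority example, a classifier that always predicts the majority class reaches an accuracy of 0.9 while its G-mean and F-measure are both 0, which is why imbalance studies report these metrics rather than accuracy alone.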
Pages: 9741-9754
Number of pages: 13