A Hybrid Machine Learning Methodology for Imbalanced Datasets

Authors
Lipitakis, Anastasia-Dimitra [1]
Kotsiantis, Sotirios [1]
Affiliations
[1] Univ Patras, Dept Math, Patras, Hellas, Greece
Source
5TH INTERNATIONAL CONFERENCE ON INFORMATION, INTELLIGENCE, SYSTEMS AND APPLICATIONS, IISA 2014 | 2014
Keywords
computational intelligence; ensembles of classifiers; imbalanced data sets; supervised machine learning; DECISION TREE; CLASSIFICATION; CLASSIFIERS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In machine learning, imbalanced data sets exhibit skewed class distributions in which most cases belong to one class and far fewer to a smaller one. A classifier induced from an imbalanced data set usually has a low error rate for the majority class and an unacceptably high error rate for the minority class. This paper gives a synoptic review of the related methodologies, introduces a new ensemble methodology, and presents an experimental comparison with other ensembles. The proposed method, which combines the OverBagging and Rotation Forest algorithms, improves the identification of a difficult small class while keeping classification accuracy on the other class at an acceptable level.
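The abstract does not specify how the two algorithms are combined, so the following is only a minimal sketch of the resampling idea commonly associated with OverBagging: each bag is built by sampling every class with replacement up to the size of the largest class. The function name `overbagging_sample` and the toy 95/5 label split are assumptions made for illustration; the Rotation Forest step and the base learner are deliberately left out, and this is not the authors' implementation.

```python
import random
from collections import Counter


def overbagging_sample(labels, rng):
    """Return instance indices for one class-balanced bootstrap bag.

    Hypothetical helper: every class is sampled with replacement until it
    matches the size of the largest class, so minority classes are
    oversampled while the majority class gets an ordinary bootstrap.
    """
    # Group instance indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Every class is resampled up to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    bag = []
    for members in by_class.values():
        bag.extend(rng.choices(members, k=target))
    rng.shuffle(bag)
    return bag


# Toy imbalanced label vector (assumed for illustration): 95 vs 5 cases.
labels = [0] * 95 + [1] * 5
rng = random.Random(0)

# One bag per ensemble member; a base classifier would be trained on each.
bags = [overbagging_sample(labels, rng) for _ in range(3)]
for bag in bags:
    print(Counter(labels[i] for i in bag))
```

Each bag here contains 95 indices from each class, so a base classifier trained on it sees a balanced sample even though the original data are skewed 95 to 5.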
Pages: 252 / +
Number of pages: 6