A Hybrid Approach Handling Imbalanced Datasets

Cited by: 0
Authors
Soda, Paolo [1]
Affiliations
[1] Univ Campus Biomed Rome, Integrated Res Ctr, Med Informat & Comp Sci Lab, Rome, Italy
Keywords
STRATEGIES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several binary classification problems exhibit imbalance in class distribution, influencing system learning. Indeed, traditional machine learning algorithms are biased towards the majority class, thus producing poor predictive accuracy over the minority one. To overcome this limitation, many approaches have been proposed up to now to build artificially balanced training sets. Further to their specific drawbacks, they achieve more balanced accuracies on each class while harming the global accuracy. This paper first reviews the more recent methods coping with unbalanced datasets and then proposes a strategy overcoming the main drawbacks of existing approaches. It is based on an ensemble of classifiers trained on balanced subsets of the original unbalanced training set, working in conjunction with the classifier trained on the original unbalanced dataset. The performance of the method has been estimated on six public datasets, proving its effectiveness also in comparison with other approaches. It also gives the chance to modify the system behaviour according to the operating scenario.
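The scheme outlined in the abstract (balanced subsets, an ensemble trained on them, plus one classifier trained on the full unbalanced set) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say how the two parts are combined, so the confidence-threshold rule in `predict_hybrid` is an assumption, as are the `NearestCentroid` placeholder learner, the function names, and the toy data.

```python
import random
from collections import Counter


class NearestCentroid:
    """Placeholder binary base learner (assumption); any classifier could be used."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    @staticmethod
    def _dist(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5

    def predict_with_confidence(self, x):
        # Confidence is the normalised margin between the two centroid distances.
        d = sorted((self._dist(x, c), label) for label, c in self.centroids.items())
        (d1, best), (d2, _) = d[0], d[1]
        conf = (d2 - d1) / (d2 + d1) if d1 + d2 > 0 else 0.0
        return best, conf


def balanced_subsets(X, y, minority, seed=0):
    """Split the majority class into chunks of minority size; pair each with all minority samples."""
    rng = random.Random(seed)
    mino = [(x, t) for x, t in zip(X, y) if t == minority]
    majo = [(x, t) for x, t in zip(X, y) if t != minority]
    rng.shuffle(majo)
    k = len(mino)
    # The last chunk may be smaller when the majority size is not a multiple of k.
    return [majo[i:i + k] + mino for i in range(0, len(majo), k)]


def fit_hybrid(X, y, minority, seed=0):
    """Train one classifier on the full unbalanced set and one per balanced subset."""
    full = NearestCentroid().fit(X, y)
    ensemble = []
    for subset in balanced_subsets(X, y, minority, seed):
        Xs, ys = zip(*subset)
        ensemble.append(NearestCentroid().fit(Xs, ys))
    return full, ensemble


def predict_hybrid(full, ensemble, x, threshold=0.5):
    """Assumed rule: trust the full-data classifier when confident, else take the ensemble vote."""
    label, conf = full.predict_with_confidence(x)
    if conf >= threshold:
        return label
    votes = Counter(m.predict_with_confidence(x)[0] for m in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): 12 majority samples (class 0), 3 minority samples (class 1).
X = [[i % 4 * 0.5, i // 4 * 0.5] for i in range(12)] + [[3.0, 3.0], [3.5, 3.0], [3.0, 3.5]]
y = [0] * 12 + [1] * 3
full, ensemble = fit_hybrid(X, y, minority=1)
print(len(ensemble), predict_hybrid(full, ensemble, [3.0, 3.0]))
```

In this sketch, `threshold` is the single knob that shifts decisions between the full-data classifier and the balanced ensemble, which is one possible reading of the abstract's remark about adapting the system to the operating scenario.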
Pages: 209-218
Number of pages: 10