A Meta-Learning Method to Select Under-Sampling Algorithms for Imbalanced Data Sets

Cited by: 0
|
Authors
de Morais, Romero F. A. B. [1 ]
Miranda, Pericles B. C. [1 ]
Silva, Ricardo M. A. [1 ]
Affiliations
[1] Univ Fed Pernambuco, Recife, PE, Brazil
Keywords
Meta-learning; Algorithm selection; Sampling algorithms;
DOI
10.1109/BRACIS.2016.65
CLC Number
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Imbalanced data sets originating from real-world problems, such as medical diagnosis, are pervasive. Learning from imbalanced data sets poses its own challenges, as common classifiers assume a balanced class distribution in the data. Sampling techniques overcome the imbalance by modifying the class distribution of the examples. Unfortunately, selecting a sampling technique together with its parameters is still an open problem. Current solutions include the brute-force approach (try as many techniques as possible) and the random-search approach (choose the most appropriate from a random subset of techniques). In this work, we propose a new method to select sampling techniques for imbalanced data sets. It uses meta-learning and works by recommending a technique for an imbalanced data set based on solutions to previous problems. Our experiments compared the proposed method against the brute-force approach, all techniques with their default parameters, and the random-search approach. The results show that the proposed method is comparable to the brute-force approach, outperforms the techniques with their default parameters most of the time, and always surpasses the random-search approach.
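The core idea of the abstract — recommending a sampling technique for a new data set from solutions to previously solved problems — can be sketched as a nearest-neighbor meta-learner. The meta-features, meta-base entries, and technique names below are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

# Hypothetical meta-knowledge base: meta-features of previously solved
# imbalanced data sets, each paired with the sampling technique that
# performed best on it. Meta-features used here (illustrative only):
# [number of examples, number of features, imbalance ratio].
META_BASE = [
    (np.array([1000.0, 10.0, 9.0]),  "random-under-sampling"),
    (np.array([200.0,  30.0, 2.0]),  "tomek-links"),
    (np.array([5000.0, 5.0,  50.0]), "cluster-centroids"),
]

def meta_features(X, y):
    """Extract simple meta-features from a binary-labeled data set (X, y)."""
    n, d = X.shape
    counts = np.bincount(y)
    ratio = counts.max() / counts.min()  # majority/minority imbalance ratio
    return np.array([float(n), float(d), float(ratio)])

def recommend(X, y):
    """Recommend the technique of the most similar past data set (1-NN)."""
    q = meta_features(X, y)
    # Normalise each meta-feature by its range so the Euclidean
    # distance is not dominated by the largest-scaled feature.
    stacked = np.stack([f for f, _ in META_BASE] + [q])
    span = stacked.max(axis=0) - stacked.min(axis=0)
    span[span == 0] = 1.0
    dists = [np.linalg.norm((f - q) / span) for f, _ in META_BASE]
    return META_BASE[int(np.argmin(dists))][1]

# Usage: a 1000-example, 10-feature data set with a 9:1 class ratio
# matches the first meta-base entry exactly.
X = np.zeros((1000, 10))
y = np.array([0] * 900 + [1] * 100)
print(recommend(X, y))  # → random-under-sampling
```

In a full system the meta-base would be built offline by running candidate samplers on many data sets and recording the winner; a k-NN with k > 1 or a meta-classifier could replace the single nearest neighbor.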
Pages: 385 - 390
Page count: 6
Related Papers
50 records
  • [1] Uncertainty Based Under-Sampling for Learning Naive Bayes Classifiers Under Imbalanced Data Sets
    Aridas, Christos K.
    Karlos, Stamatis
    Kanas, Vasileios G.
    Fazakis, Nikos
    Kotsiantis, Sotiris B.
    IEEE ACCESS, 2020, 8 : 2122 - 2133
  • [2] An Under-sampling Imbalanced Learning of Data Gravitation Based Classification
    Peng, Lizhi
    Yang, Bo
    Chen, Yuehui
    Zhou, Xiaoqing
    2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2016, : 419 - 425
  • [3] A Hybrid Under-Sampling Method (HUSBoost) to Classify Imbalanced Data
    Popel, Mahmudul Hasan
    Hasib, Khan Md
    Habib, Syed Ahsan
    Shah, Faisal Muhammad
    2018 21ST INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2018,
  • [4] Under-sampling method based on sample weight for imbalanced data
    Xiong B.
    Wang G.
    Deng W.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2016, 53 (11): : 2613 - 2622
  • [5] AN IMBALANCED DATA CLASSIFICATION METHOD BASED ON AUTOMATIC CLUSTERING UNDER-SAMPLING
    Deng, Xiaoheng
    Zhong, Weijian
    Ren, Ju
    Zeng, Detian
    Zhang, Honggang
    2016 IEEE 35TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2016,
  • [6] An Imbalanced Multi-Label Data Ensemble Learning Method Based on Safe Under-Sampling
    Sun, Zhong-Bin
    Diao, Yu-Xuan
    Ma, Su-Yang
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2024, 52 (10): : 3392 - 3408
  • [7] Rough Set Assisted Meta-Learning Method to Select Learning Algorithms
    Fan, Lisa
    Lei, Minxiao
    (Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada)
    Journal of Nanchang Institute of Technology, 2006, (02) : 83 - 87+91
  • [8] Several SVM Ensemble Methods Integrated with Under-Sampling for Imbalanced Data Learning
    Lin, ZhiYong
    Hao, ZhiFeng
    Yang, XiaoWei
    Liu, XiaoLan
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2009, 5678 : 536 - +
  • [9] An Active Under-sampling Approach for Imbalanced Data Classification
    Yang, Zeping
    Gao, Daqi
    2012 FIFTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2012), VOL 2, 2012, : 270 - 273
  • [10] Evolutionary under-sampling based bagging ensemble method for imbalanced data classification
    Sun, Bo
    Chen, Haiyan
    Wang, Jiandong
    Xie, Hua
    FRONTIERS OF COMPUTER SCIENCE, 2018, 12 (02) : 331 - 350