Ensemble Learning with Active Example Selection for Imbalanced Biomedical Data Classification

Cited by: 72
Authors
Oh, Sangyoon [1]
Lee, Min Su [2, 3]
Zhang, Byoung-Tak [2, 3]
Affiliations
[1] Ajou Univ, Div Informat & Comp Engn, WISE Lab, Suwon 443749, Kyeonggi, South Korea
[2] Seoul Natl Univ, CBIT, Seoul 151742, South Korea
[3] Seoul Natl Univ, Sch Engn & Comp Sci, Seoul 151742, South Korea
Keywords
Bioinformatics; classification; interactive data exploration and discovery; mining methods and algorithms; DISCOVERY;
DOI
10.1109/TCBB.2010.96
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
In biomedical data, the imbalanced data problem occurs frequently and causes poor prediction performance for minority classes, because the trained classifiers are derived mostly from the majority class. In this paper, we describe an ensemble learning method combined with active example selection to resolve the imbalanced data problem. Our method consists of three key components: 1) an active example selection algorithm to choose informative examples for training the classifier, 2) an ensemble learning method to combine variations of classifiers derived by active example selection, and 3) an incremental learning scheme to speed up the iterative training procedure for active example selection. We evaluate the method on six real-world imbalanced data sets in biomedical domains, showing that the proposed method outperforms both random undersampling and ensemble-with-undersampling methods. Compared to other approaches to the imbalanced data problem, our method achieves an AUC that is 0.03 to 0.15 higher.
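The three components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the base classifier, the selection criterion, or the combination rule, so everything below is an assumption made for illustration. The sketch assumes a logistic-regression base learner trained by gradient descent, uncertainty sampling (picking majority examples closest to the decision boundary) as the active selection step, warm-started retraining as a stand-in for incremental learning, and probability averaging as the ensemble rule. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, w=None, epochs=200, lr=0.1):
    """Gradient-descent logistic regression. Passing w warm-starts training,
    used here as an assumed stand-in for the incremental learning scheme."""
    Xb = add_bias(X)
    if w is None:
        w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        w = w - lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def predict_proba(X, w):
    return sigmoid(add_bias(X) @ w)

def train_member(X, y, rng, rounds=5, batch=10):
    """Train one ensemble member (label 1 = minority, 0 = majority)."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    # Assumed seed set: all minority examples plus an equal-size random
    # majority subset, so each member starts from a different sample.
    n_seed = min(len(minority), len(majority))
    seed_major = rng.choice(majority, size=n_seed, replace=False)
    selected = np.concatenate([minority, seed_major])
    pool = np.setdiff1d(majority, seed_major)
    w = fit_logistic(X[selected], y[selected])
    for _ in range(rounds):
        if len(pool) == 0:
            break
        # Active selection (assumed criterion): take the pool examples whose
        # predicted probability is closest to 0.5.
        margin = np.abs(predict_proba(X[pool], w) - 0.5)
        pick = pool[np.argsort(margin)[:batch]]
        selected = np.concatenate([selected, pick])
        pool = np.setdiff1d(pool, pick)
        # Continue from the current weights instead of retraining from zero.
        w = fit_logistic(X[selected], y[selected], w=w, epochs=50)
    return w

def train_ensemble(X, y, n_members=5, seed=0):
    rng = np.random.default_rng(seed)
    return [train_member(X, y, rng) for _ in range(n_members)]

def ensemble_proba(X, members):
    # Assumed combination rule: average the members' probabilities.
    return np.mean([predict_proba(X, w) for w in members], axis=0)

def auc(y_true, scores):
    """Rank-based AUC (Mann-Whitney statistic); tied scores get arbitrary order."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(np.sum(y_true == 1))
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

To try it, call `train_ensemble(X, y)` on a feature matrix `X` and a 0/1 label vector `y`, then score new data with `ensemble_proba` and compare against held-out labels with `auc`. The results reported in the abstract come from the authors' own method and data sets, not from this sketch.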
Pages: 316-325
Page count: 10
Related papers
50 in total
  • [21] Reinforcement Online Active Learning Ensemble for Drifting Imbalanced Data Streams
    Zhang, Hang
    Liu, Weike
    Liu, Qingbao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (08) : 3971 - 3983
  • [22] Imbalanced data classification: Using transfer learning and active sampling
    Liu, Yang
    Yang, Guoping
    Qiao, Shaojie
    Liu, Meiqi
    Qu, Lulu
    Han, Nan
    Wu, Tao
    Yuan, Guan
    Peng, Yuzhong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [23] Sparse projection infinite selection ensemble for imbalanced classification
    Ning, Zhihan
    Jiang, Zhixing
    Zhang, David
    KNOWLEDGE-BASED SYSTEMS, 2023, 262
  • [24] Dynamic Ensemble Framework for Imbalanced Data Classification
    Zhu, Tuanfei
    Hu, Xingchen
    Liu, Xinwang
    Zhu, En
    Zhu, Xinzhong
    Xu, Huiying
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (05) : 2456 - 2471
  • [25] Leveraging ensemble pruning for imbalanced data classification
    Krawczyk, Bartosz
    Wozniak, Michal
    2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2018, : 439 - 444
  • [26] Multi-window based ensemble learning for classification of imbalanced streaming data
    Li, Hu
    Wang, Ye
    Wang, Hua
    Zhou, Bin
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2017, 20 (06) : 1507 - 1525
  • [27] Multi-window based ensemble learning for classification of imbalanced streaming data
    Li, Hu
    Wang, Ye
    Wang, Hua
    Zhou, Bin
    World Wide Web, 2017, 20 : 1507 - 1525
  • [28] A Method of Imbalanced Traffic Classification Based on Ensemble Learning
    Ding, Yaojun
    2015 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2015, : 265 - 268
  • [29] A contemporary feature selection and classification framework for imbalanced biomedical datasets
    Bikku, Thulasi
    Nandam, Sambasiva Rao
    Akepogu, Ananda Rao
    EGYPTIAN INFORMATICS JOURNAL, 2018, 19 (03) : 191 - 198
  • [30] Multicriteria Classifier Ensemble Learning for Imbalanced Data
    Wegier, Weronika
    Koziarski, Michal
    Wozniak, Michal
    IEEE Access, 2022, 10 : 16807 - 16818