Ensemble Learning with Active Example Selection for Imbalanced Biomedical Data Classification

Cited by: 72
Authors
Oh, Sangyoon [1]
Lee, Min Su [2, 3]
Zhang, Byoung-Tak [2, 3]
Affiliations
[1] Ajou Univ, Div Informat & Comp Engn, WISE Lab, Suwon 443749, Kyeonggi, South Korea
[2] Seoul Natl Univ, CBIT, Seoul 151742, South Korea
[3] Seoul Natl Univ, Sch Engn & Comp Sci, Seoul 151742, South Korea
Keywords
Bioinformatics; classification; interactive data exploration and discovery; mining methods and algorithms; DISCOVERY;
DOI
10.1109/TCBB.2010.96
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
In biomedical data, the imbalanced data problem occurs frequently and causes poor prediction performance for minority classes, because the trained classifiers are derived mostly from the majority class. In this paper, we describe an ensemble learning method combined with active example selection to resolve the imbalanced data problem. Our method consists of three key components: 1) an active example selection algorithm to choose informative examples for training the classifier, 2) an ensemble learning method to combine variations of classifiers derived by active example selection, and 3) an incremental learning scheme to speed up the iterative training procedure for active example selection. We evaluate the method on six real-world imbalanced data sets in biomedical domains, showing that the proposed method outperforms both random undersampling and the ensemble with undersampling. Compared to other approaches to the imbalanced data problem, our method achieves an AUC that is higher by 0.03 to 0.15.
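The three components named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify in detail. The logistic model, the "closest to the decision boundary" informativeness score, the batch size, and all function names (`train_member`, `build_ensemble`, `ensemble_proba`) are assumptions chosen for illustration, and the data are synthetic.

```python
import math
import random


def predict_proba(w, x):
    """Logistic model: estimated probability that x is a minority (positive) example."""
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def sgd_update(w, examples, epochs=20, lr=0.1):
    """Incremental learning step: keep training from the current weights."""
    for _ in range(epochs):
        for x, y in examples:
            err = y - predict_proba(w, x)
            w[0] += lr * err
            for j, xj in enumerate(x):
                w[j + 1] += lr * err * xj
    return w


def train_member(minority, majority, rng, rounds=3):
    """Train one ensemble member with active selection of majority examples."""
    k = len(minority)
    pool = list(majority)
    rng.shuffle(pool)  # each member starts from a different random subset
    positives = [(x, 1) for x in minority]
    start, pool = pool[:k], pool[k:]
    w = sgd_update([0.0] * (len(minority[0]) + 1),
                   positives + [(x, 0) for x in start])
    for _ in range(rounds):
        if not pool:
            break
        # Assumed informativeness score: closeness to the decision boundary.
        pool.sort(key=lambda x: abs(predict_proba(w, x) - 0.5))
        chosen, pool = pool[:k], pool[k:]
        # Warm start: update on the new batch instead of retraining from scratch.
        w = sgd_update(w, positives + [(x, 0) for x in chosen])
    return w


def build_ensemble(minority, majority, n_members=5, seed=0):
    """Train several members that differ in their random starting subsets."""
    rng = random.Random(seed)
    return [train_member(minority, majority, rng) for _ in range(n_members)]


def ensemble_proba(models, x):
    """Combine members by averaging their predicted probabilities."""
    return sum(predict_proba(w, x) for w in models) / len(models)


# Synthetic, hypothetical data: 10 minority vs. 200 majority examples.
data_rng = random.Random(42)
minority = [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(10)]
majority = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(200)]

models = build_ensemble(minority, majority)
print(ensemble_proba(models, (4.0, 4.0)), ensemble_proba(models, (-1.0, -1.0)))
```

Each member sees every minority example but only a small, actively chosen slice of the majority class, so no single training set is dominated by the majority. Averaging over members that start from different random subsets is one simple way to combine the resulting classifier variations.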
Pages: 316-325
Page count: 10