Ensemble Learning with Active Example Selection for Imbalanced Biomedical Data Classification

Cited by: 72
Authors
Oh, Sangyoon [1]
Lee, Min Su [2, 3]
Zhang, Byoung-Tak [2, 3]
Affiliations
[1] Ajou Univ, Div Informat & Comp Engn, WISE Lab, Suwon 443749, Kyeonggi, South Korea
[2] Seoul Natl Univ, CBIT, Seoul 151742, South Korea
[3] Seoul Natl Univ, Sch Engn & Comp Sci, Seoul 151742, South Korea
Keywords
Bioinformatics; classification; interactive data exploration and discovery; mining methods and algorithms; DISCOVERY;
DOI
10.1109/TCBB.2010.96
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
In biomedical data, the imbalanced data problem occurs frequently and causes poor prediction performance for minority classes, because the trained classifiers are derived mostly from the majority class. In this paper, we describe an ensemble learning method combined with active example selection to resolve the imbalanced data problem. Our method consists of three key components: 1) an active example selection algorithm that chooses informative examples for training the classifier, 2) an ensemble learning method that combines the variations of classifiers derived by active example selection, and 3) an incremental learning scheme that speeds up the iterative training procedure for active example selection. We evaluate the method on six real-world imbalanced data sets from biomedical domains and show that it outperforms both random undersampling and ensembles with undersampling. Compared with other approaches to the imbalanced data problem, our method achieves an AUC that is higher by 0.03 to 0.15.
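The three components named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the base classifier, the selection criterion, or the combination rule, so everything below is an assumption made for illustration. The sketch assumes a hand-written logistic regression as the base classifier, "informative" meaning the unused majority examples with the largest current prediction error, warm-started gradient descent as the incremental step, score averaging as the ensemble rule, and synthetic Gaussian data in place of the biomedical data sets. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logreg(X, y, w=None, epochs=200, lr=0.1):
    """Gradient-descent logistic regression. Passing w warm-starts training
    (assumed stand-in for the incremental learning scheme)."""
    Xb = add_bias(X)
    if w is None:
        w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        grad = Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
        w = w - lr * grad
    return w

def predict_proba(w, X):
    return sigmoid(add_bias(X) @ w)

def train_member(X, y, rng, rounds=5, batch=20):
    """Train one ensemble member. Assumes label 1 is the minority class."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    # Start from all minority examples plus an equal-sized random majority subset.
    start = rng.choice(majority, size=len(minority), replace=False)
    in_train = np.zeros(len(y), dtype=bool)
    in_train[np.concatenate([minority, start])] = True
    w = fit_logreg(X[in_train], y[in_train])
    for _ in range(rounds):
        pool = np.flatnonzero(~in_train)
        if len(pool) == 0:
            break
        # Assumed selection criterion: largest absolute prediction error.
        err = np.abs(y[pool] - predict_proba(w, X[pool]))
        in_train[pool[np.argsort(err)[-batch:]]] = True
        # Continue from the current weights instead of retraining from scratch.
        w = fit_logreg(X[in_train], y[in_train], w=w, epochs=50)
    return w

def ensemble_scores(members, X):
    """Average the members' predicted probabilities."""
    return np.mean([predict_proba(w, X) for w in members], axis=0)

def auc(y, s):
    """Rank-based AUC (Mann-Whitney statistic); ties are not averaged."""
    order = np.argsort(s)
    ranks = np.empty(len(s))
    ranks[order] = np.arange(1, len(s) + 1)
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synthetic 10:1 imbalanced data (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (500, 2)), rng.normal(2, 1, (50, 2))])
y = np.concatenate([np.zeros(500), np.ones(50)])

# Each member sees a different random starting subset, which gives the
# ensemble its variation.
members = [train_member(X, y, rng) for _ in range(5)]
scores = ensemble_scores(members, X)
print("training AUC:", round(auc(y, scores), 3))
```

The AUC here is computed on the training data only to show the pieces fitting together; a real comparison against undersampling baselines would need held-out data or cross-validation.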
Pages: 316-325
Page count: 10