l2,1 norm regularized multi-kernel based joint nonlinear feature selection and over-sampling for imbalanced data classification

Cited by: 28
Authors
Cao, Peng [1 ]
Liu, Xiaoli [1 ]
Zhang, Jian [3 ]
Zhao, Dazhe [4 ]
Huang, Min [2 ]
Zaiane, Osmar [5 ]
Affiliations
[1] Northeastern Univ, Coll Comp Sci & Engn, Shenyang, Peoples R China
[2] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China
[3] Nanjing Univ Informat Sci Technol, Sch Comp Software, Nanjing, Peoples R China
[4] Northeastern Univ, Minist Educ, Key Lab Med Image Comp, Shenyang, Peoples R China
[5] Univ Alberta, Comp Sci, Edmonton, AB, Canada
Funding
US National Science Foundation; National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);
Keywords
Imbalanced data learning; Feature selection; Classification; Multi-kernel learning; Proximal method;
DOI
10.1016/j.neucom.2016.12.036
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
High dimensionality and the classification of imbalanced data sets are two of the most challenging problems in machine learning, and they have typically been studied independently in the literature. To address feature selection and over-sampling simultaneously, we combine the two methodological approaches in a unified kernel framework. Specifically, we propose a novel l(2,1) norm regularized multiple kernel feature selection method (l(2,1) MKFS) and design a proximal optimization algorithm for efficiently learning the model. Moreover, multiple kernel over-sampling (MKOS) is developed to generate synthetic instances in the optimal kernel space induced by l(2,1) MKFS, so as to compensate for the imbalanced class distribution. Experimental results on multiple UCI data sets and two real medical applications demonstrate that jointly performing nonlinear feature selection and over-sampling within the l(2,1) norm multi-kernel learning framework (l(2,1) MKFSOS) leads to promising classification performance.
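The computational core of l(2,1) norm regularized models of this kind is the proximal operator of the l(2,1) norm, which shrinks whole rows of a weight matrix to zero and thereby performs the selection. The sketch below is a minimal illustration of proximal gradient descent with this operator on a toy least-squares loss; the function names, step size, and toy objective are assumptions for illustration only, not the paper's l(2,1) MKFS algorithm or its MKOS stage.

```python
# Hypothetical sketch (not the authors' exact method): proximal gradient descent
# with the row-wise soft-thresholding operator, which is the proximal map of the
# l(2,1) norm. Driving rows of W to zero is what performs feature/kernel selection.
import numpy as np

def prox_l21(W, tau):
    """Proximal operator of tau * ||W||_{2,1}: shrink each row of W toward zero."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

def proximal_gradient(grad_f, W0, lam, step, n_iter=200):
    """Minimize f(W) + lam * ||W||_{2,1} given a gradient oracle for smooth f."""
    W = W0.copy()
    for _ in range(n_iter):
        W = prox_l21(W - step * grad_f(W), step * lam)
    return W

# Toy usage with a least-squares loss f(W) = 0.5 * ||X W - Y||_F^2 (illustrative only).
rng = np.random.default_rng(0)
X, Y = rng.standard_normal((50, 20)), rng.standard_normal((50, 3))
grad_f = lambda W: X.T @ (X @ W - Y)
W_hat = proximal_gradient(grad_f, np.zeros((20, 3)), lam=5.0, step=1e-3)
print("selected rows:", np.flatnonzero(np.linalg.norm(W_hat, axis=1) > 1e-6))
```

In the paper's setting the smooth term would be a multiple kernel classification objective rather than this toy loss, and the rows of the weight matrix would correspond to base kernels or features, but the row-wise shrinkage step used by proximal (FISTA-style) solvers is the same.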
Pages: 38-57
Page count: 20