Gaussian prior based adaptive synthetic sampling with non-linear sample space for imbalanced learning

Cited by: 12
Authors
Zhang, Tianlun [1 ]
Li, Yang [1 ]
Wang, Xizhao [2 ]
Affiliations
[1] Dalian Maritime Univ, Coll Informat Sci & Technol, Dalian 116026, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Imbalanced learning; Error bound model; Adaptive method; Classification algorithm; Gaussian mixture model; LOCALIZED GENERALIZATION ERROR; CLASSIFICATION; IDENTIFICATION; SELECTION; MACHINE; SMOTE;
DOI
10.1016/j.knosys.2019.105231
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the presence of a skewed class distribution, most learning algorithms fail to adequately represent the characteristics of the data, so learning from imbalanced data is a crucial challenge in data engineering and knowledge discovery. In this work, we propose an imbalanced learning method that generates minority-class samples to compensate for the skew in the class distribution. Unlike existing synthetic over-sampling techniques, data generation is conducted within a hyperplane rather than on a hyperline, so the proposed method breaks the constraint imposed by linear interpolation. In addition, the method minimizes sampling uncertainty and risk by integrating prior knowledge about the minority-class instances. Moreover, a multi-objective optimization combined with an error bound model turns the proposed method into an adaptive imbalanced learning scheme. Extensive experiments have been performed on imbalanced problems, and the results demonstrate that the method can improve the performance of different classification algorithms. (C) 2019 Elsevier B.V. All rights reserved.
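The abstract contrasts SMOTE-style generation on a hyperline (linear interpolation between a minority sample and a neighbour) with generation guided by a Gaussian prior over the minority class. The sketch below is a minimal illustration of that general idea only, not the authors' algorithm: the function names, the choice of scikit-learn's GaussianMixture, the number of mixture components, and the toy data are assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def smote_like_interpolation(x_i, x_nn, rng):
    """Classic SMOTE step: the new sample lies ON the line segment
    (the 'hyperline') between a minority sample and a minority neighbour."""
    lam = rng.uniform(0.0, 1.0)
    return x_i + lam * (x_nn - x_i)


def gaussian_prior_oversample(X_minority, n_new, n_components=2, seed=0):
    """Illustrative alternative: fit a Gaussian mixture to the minority class
    and draw synthetic samples from it, so new points fill a region of the
    feature space instead of being confined to pairwise line segments."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full",
                          random_state=seed)
    gmm.fit(X_minority)
    X_new, _ = gmm.sample(n_new)
    return X_new


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy minority class: 40 two-dimensional points from two clusters.
    X_min = np.vstack([rng.normal([0.0, 0.0], 0.3, size=(20, 2)),
                       rng.normal([2.0, 2.0], 0.3, size=(20, 2))])
    X_syn = gaussian_prior_oversample(X_min, n_new=60, n_components=2)
    print(X_syn.shape)  # (60, 2) synthetic minority samples
```

In this sketch, sampling from the fitted mixture plays the role of the Gaussian prior on the minority class; the paper's additional multi-objective optimization and error-bound-driven adaptivity are not reproduced here.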
Pages: 10