Deep Discriminative Features Learning and Sampling for Imbalanced Data Problem

Cited by: 20
Authors
Liu, Yi-Hsun [1]
Liu, Chien-Liang [2]
Tseng, Vincent Shin-Mu [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Natl Chiao Tung Univ, Dept Ind Engn & Management, Hsinchu, Taiwan
Source
2018 IEEE International Conference on Data Mining (ICDM) | 2018
Keywords
Imbalanced Data; Synthetic Sampling; Feature Embedding; Center Loss; Triplet Loss; Classification; SMOTE
DOI
10.1109/ICDM.2018.00150
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The imbalanced data problem occurs in many application domains and is considered a challenging problem in machine learning and data mining. Most resampling methods that generate synthetic data focus on the minority class without considering the data distribution of the majority classes. In contrast to previous work, the proposed method considers both majority and minority classes when learning feature embeddings, and it uses appropriate loss functions to make the feature embeddings as discriminative as possible. The proposed method is a comprehensive framework in which different deep learning feature extractors can be used for different domains. We conduct experiments on seven numerical datasets and one image dataset, all based on multiclass classification tasks. The experimental results indicate that the proposed method provides accurate and stable results.
Pages: 1146-1151
Number of pages: 6