ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning

Cited by: 3566
Authors
He, Haibo [1]
Bai, Yang [1]
Garcia, Edwardo A. [1]
Li, Shutao [2]
Affiliations
[1] Stevens Inst Technol, Dept Elect & Comp Engn, Hoboken, NJ 07030 USA
[2] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Hunan, Peoples R China
Source
2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8 | 2008
DOI
10.1109/IJCNN.2008.4633969
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution for different minority class examples according to their level of difficulty in learning, where more synthetic data is generated for minority class examples that are harder to learn compared to those minority examples that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) reducing the bias introduced by the class imbalance, and (2) adaptively shifting the classification decision boundary toward the difficult examples. Simulation analyses on several machine learning data sets show the effectiveness of this method across five evaluation metrics.
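The abstract describes the weighting idea but not the procedure itself. The following is a minimal NumPy sketch of the commonly described form of ADASYN, under stated assumptions: difficulty is measured as the fraction of majority-class points among each minority example's k nearest neighbors, and synthetic points are made by linear interpolation toward a random minority neighbor. The function name `adasyn_sketch`, its parameters, and the uniform fallback when no minority example has a majority neighbor are illustrative choices, not taken from this record.

```python
import numpy as np

def adasyn_sketch(X, y, minority_label=1, k=5, beta=1.0, seed=0):
    """Illustrative ADASYN-style oversampling (hypothetical helper).

    Returns (synthetic_points, weights), where weights[i] is the share of
    synthetic data assigned to the i-th minority example.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    idx_min = np.flatnonzero(y == minority_label)
    X_min = X[idx_min]
    n_min = len(X_min)
    n_maj = len(X) - n_min

    # Total number of synthetic points; beta=1.0 aims at a balanced set.
    total = int(round((n_maj - n_min) * beta))
    if total <= 0 or n_min < 2:
        return np.empty((0, X.shape[1])), np.zeros(n_min)

    # Difficulty: fraction of majority points among the k nearest neighbors
    # of each minority example, searched over the whole data set.
    d_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    d_all[np.arange(n_min), idx_min] = np.inf  # exclude the point itself
    nn_all = np.argsort(d_all, axis=1)[:, :k]
    r = (y[nn_all] != minority_label).mean(axis=1)

    # Normalize difficulties into a distribution (assumption: fall back to
    # uniform weights if no minority example has a majority neighbor).
    weights = r / r.sum() if r.sum() > 0 else np.full(n_min, 1.0 / n_min)
    counts = np.round(weights * total).astype(int)

    # Interpolation partners are drawn from minority-only neighbors.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)
    nn_min = np.argsort(d_min, axis=1)[:, :min(k, n_min - 1)]

    synthetic = []
    for i in range(n_min):
        for _ in range(counts[i]):
            j = rng.choice(nn_min[i])
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic).reshape(-1, X.shape[1]), weights
```

Minority examples surrounded by majority points receive larger weights and therefore more synthetic neighbors, which is the adaptive behavior the abstract refers to. Because each count is rounded separately, the number of generated points may differ slightly from the target total.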
Pages: 1322-1328
Page count: 7