KA-Ensemble: towards imbalanced image classification ensembling under-sampling and over-sampling

Cited by: 16
Authors
Ding, Hao [1]
Wei, Bin [2]
Gu, Zhaorui [1]
Yu, Zhibin [1]
Zheng, Haiyong [1,3]
Zheng, Bing [1]
Li, Juan [4]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Dept Elect Engn, Qingdao 266100, Peoples R China
[2] Qingdao Univ, Shandong Key Lab Digital Med & Comp Assisted Surg, Affiliated Hosp, Qingdao 266003, Peoples R China
[3] Univ Dundee, Sch Sci & Engn, Dept Math, Dundee DD1 4HN, Scotland
[4] Qingdao Agr Univ, Coll Mech & Elect Engn, Qingdao 266109, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Class-imbalance learning; Under-sampling; Over-sampling; Ensemble learning; Image classification; REGRESSION; PREDICTION; SMOTE;
DOI
10.1007/s11042-019-07856-y
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Imbalanced learning has become a research emphasis in recent years because of the growing number of class-imbalance classification problems in real applications, and it is particularly challenging when the imbalance rate is very high. Sampling, including under-sampling and over-sampling, is an intuitive and popular way of dealing with class-imbalance problems: it regroups the original dataset and has proved effective. Its main deficiency is that under-sampling methods usually discard many majority-class examples, while over-sampling methods can easily cause over-fitting. In this paper, we propose a new algorithm, dubbed KA-Ensemble, that ensembles under-sampling and over-sampling to overcome this issue. KA-Ensemble extends the EasyEnsemble framework by randomly under-sampling the majority class while simultaneously over-sampling the minority class via kernel-based adaptive synthetic sampling (Kernel-ADASYN), yielding a group of balanced datasets on which corresponding classifiers are trained separately; the final result is obtained by voting over all trained classifiers. By combining under-sampling and over-sampling in this way, KA-Ensemble is well suited to class-imbalance problems with high imbalance rates. We evaluated the proposed method against state-of-the-art sampling methods on 9 image classification datasets with imbalance rates ranging from less than 2 to more than 15, and the experimental results show that KA-Ensemble performs better in terms of accuracy (ACC), F-Measure, G-Mean, and area under the ROC curve (AUC). Moreover, it can be applied to both binary and multi-class classification, on image classification as well as other class-imbalance problems.
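The Python sketch below illustrates the workflow the abstract describes, under stated assumptions: binary labels y in {0, 1} with class 1 as the minority, imbalanced-learn's plain ADASYN standing in for Kernel-ADASYN (that library ships no Kernel-ADASYN), and an arbitrary decision-tree base classifier. The function names, round count, and intermediate sampling ratio are illustrative, not taken from the paper.

```python
# A minimal sketch of the KA-Ensemble idea, not the authors' implementation.
# Assumptions (not from the paper): binary labels y in {0, 1} with class 1
# as the minority; plain ADASYN stands in for Kernel-ADASYN; the base
# classifier, round count, and under-sampling ratio are illustrative.
import numpy as np
from imblearn.over_sampling import ADASYN
from imblearn.under_sampling import RandomUnderSampler
from sklearn.tree import DecisionTreeClassifier


def ka_ensemble_fit(X, y, n_rounds=10, random_state=0):
    """Train one classifier per balanced dataset, EasyEnsemble-style."""
    rng = np.random.RandomState(random_state)
    members = []
    for _ in range(n_rounds):
        seed = int(rng.randint(0, 2**31 - 1))
        # Randomly under-sample the majority class down to twice the
        # minority size (a fresh random subset each round, as in
        # EasyEnsemble)...
        X_u, y_u = RandomUnderSampler(
            sampling_strategy=0.5, random_state=seed
        ).fit_resample(X, y)
        # ...then over-sample the minority class with (Kernel-)ADASYN so
        # this round's training set ends up balanced.
        X_b, y_b = ADASYN(random_state=seed).fit_resample(X_u, y_u)
        members.append(DecisionTreeClassifier(random_state=seed).fit(X_b, y_b))
    return members


def ka_ensemble_predict(members, X):
    """Majority vote over all trained member classifiers."""
    votes = np.stack([clf.predict(X) for clf in members])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```

Under-sampling only partway (here to a 1:2 ratio) before letting the over-sampler close the remaining gap mirrors the paper's motivation: each round discards fewer majority examples than pure under-sampling, and needs fewer synthetic minority examples than pure over-sampling.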
Pages: 14871-14888
Number of pages: 18
Related Papers (50 in total)
[1]   SMOTE: Synthetic minority over-sampling technique [J].
Chawla, Nitesh V. ;
Bowyer, Kevin W. ;
Hall, Lawrence O. ;
Kegelmeyer, W. Philip .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 :321-357
[2]   SMOTEBoost: Improving prediction of the minority class in boosting [J].
Chawla, NV ;
Lazarevic, A ;
Hall, LO ;
Bowyer, KW .
KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2003, PROCEEDINGS, 2003, 2838 :107-119
[3]  
Cherkassky V. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1997, 8 :1564, DOI 10.1109/TNN.1997.641482
[4]   On the optimality of the simple Bayesian classifier under zero-one loss [J].
Domingos, P ;
Pazzani, M .
MACHINE LEARNING, 1997, 29 (2-3) :103-130
[5]  
Drummond C. ICML Workshop on Learning from Imbalanced Data Sets II, 2003, 11 :1
[6]   A multiple resampling method for learning from imbalanced data sets [J].
Estabrooks, A ;
Jo, TH ;
Japkowicz, N .
COMPUTATIONAL INTELLIGENCE, 2004, 20 (01) :18-36
[7]  
Fan W. Machine Learning: Proceedings of the Sixteenth International Conference (ICML), 1999 :97
[8]  
Fanny. PROCEDIA COMPUTER SCIENCE, 2018, 135 :60, DOI 10.1016/j.procs.2018.08.150
[9]   A decision-theoretic generalization of on-line learning and an application to boosting [J].
Freund, Y ;
Schapire, RE .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1997, 55 (01) :119-139
[10]  
Goodfellow I, Bengio Y, Courville A. Deep Learning [M]. MIT Press, Adaptive Computation and Machine Learning series, 2016.