Dynamic Synthetic Minority Over-Sampling Technique-Based Rotation Forest for the Classification of Imbalanced Hyperspectral Data

Cited by: 54
Authors
Feng, Wei [1, 2]
Dauphin, Gabriel [3 ]
Huang, Wenjiang [1 ]
Quan, Yinghui [4 ]
Bao, Wenxing [5 ]
Wu, Mingquan [6 ]
Li, Qiang [7 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Digital Earth Sci, Beijing 100094, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Xian 710071, Shaanxi, Peoples R China
[3] Univ Paris 13, Inst Galilee, L2TI, Lab Informat Proc & Transmiss, F-93430 Villetaneuse, France
[4] Xidian Univ, Key Lab Radar Signal Proc, Xian 710071, Shaanxi, Peoples R China
[5] Beifang Univ, Sch Comp Sci & Engn, Yinchuan 750021, Peoples R China
[6] Chinese Acad Sci, Aerosp Informat Res Inst, State Key Lab Remote Sensing Sci, Beijing 100101, Peoples R China
[7] Chinese Acad Sci, Inst Theoret Phys, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Ensemble learning; hyperspectral image classification; imbalance learning; rotation forest (RoF); ENSEMBLES; SELECTION; SMOTE;
DOI
10.1109/JSTARS.2019.2922297
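Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code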
0808 ; 0809 ;
Abstract
Rotation forest (RoF) is a powerful ensemble classifier that has attracted substantial attention for its performance in hyperspectral data classification. Multi-class imbalance learning is one of the biggest challenges in machine learning and remote sensing. The standard technique for constructing an RoF ensemble tends to increase the overall accuracy, but RoF has difficulty recognizing the minority classes sufficiently. This paper proposes a novel dynamic SMOTE (synthetic minority oversampling technique)-based RoF algorithm for the multi-class imbalance problem. The main idea of the proposed method is to dynamically balance the class distribution before building each rotation decision tree. A resampling rate is set in each iteration (ranging from 10% in the first iteration to 100% in the last), and this rate defines the number of minority class instances randomly resampled (with replacement) from the original dataset in that iteration. The rest of the minority class instances are generated by the SMOTE method. The reported results on three real hyperspectral datasets show that the proposed method performs better than random forest, RoF, and some popular data sampling methods.
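The balancing step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Several details are assumptions that the abstract does not state: the rate is spaced linearly between 10% and 100%, every minority class is grown to the size of the largest class, the rate is applied to that target size, and SMOTE uses k = 5 nearest neighbours. The names `resampling_rates`, `smote_samples`, and `dynamic_balance` are hypothetical, and the rotation decision tree itself is not implemented.

```python
import numpy as np


def resampling_rates(n_iterations, start=0.10, end=1.00):
    """Rates from 10% to 100%; the linear spacing is an assumption."""
    return np.linspace(start, end, n_iterations)


def smote_samples(X_min, n_new, k, rng):
    """Basic SMOTE: interpolate between a minority instance and one of
    its k nearest neighbours from the same class."""
    n = len(X_min)
    if n_new == 0:
        return np.empty((0, X_min.shape[1]))
    if n < 2:
        # A single instance has no neighbour to interpolate with: copy it.
        return np.repeat(X_min, n_new, axis=0)
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances within the class.
    d = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # an instance is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[nb] - X_min[base])


def dynamic_balance(X, y, rate, k=5, rng=None):
    """Balance all classes to the majority size (assumed target).

    For each minority class, round(rate * target) instances are drawn
    with replacement from the original instances of that class, and
    the remainder is generated by SMOTE.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for c, count in zip(classes, counts):
        X_c = X[y == c]
        if count == target:
            X_bal = X_c  # the largest class is kept unchanged
        else:
            n_resampled = int(round(rate * target))
            idx = rng.integers(0, count, size=n_resampled)
            X_syn = smote_samples(X_c, target - n_resampled, k, rng)
            X_bal = np.vstack([X_c[idx], X_syn])
        X_parts.append(X_bal)
        y_parts.append(np.full(len(X_bal), c))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Toy data: one majority class and two minority classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4))
y = np.array([0] * 100 + [1] * 15 + [2] * 5)

for rate in resampling_rates(10):
    X_bal, y_bal = dynamic_balance(X, y, rate, rng=rng)
    # A rotation decision tree would be trained on (X_bal, y_bal) here.
```

Under these assumptions, early trees see mostly synthetic minority instances and later trees see mostly resampled originals, so each tree in the ensemble is trained on a differently balanced set.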
Pages: 2159-2169
Number of pages: 11