Imbalanced classification;
Least squares support numerical spectrum;
Minority samples weights;
Oversampling;
k* information nearest neighbors;
SMOTE;
MACHINE;
DOI:
10.1016/j.knosys.2020.106116
Chinese Library Classification (CLC) number:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
As the essence of machine learning, classification is widely used in real life; however, imbalanced data poses great challenges to classification problems, because standard classifiers tend to favor the majority instances and ignore the minority instances. Newer oversampling algorithms (e.g., A-SUWO) based on the improving majority weighted minority oversampling (IMWMO) method assign weights using the Euclidean distances from majority instances to hard-to-learn minority instances, and then guide the synthesis of minority samples according to those weights to address the offset of the classification hyperplanes. A-SUWO achieves better results than traditional oversampling algorithms (e.g., SMOTE and MWMOTE) when its parameters are well tuned. However, A-SUWO may give minority training samples inappropriate weights in some irregularly distributed scenarios and make learning tasks even harder. Additionally, A-SUWO's kNN synthesizing method may not obtain wider and more effective instances. Therefore, we propose an improving adaptive semi-unsupervised weighted oversampling (IA-SUWO) technique to address imbalanced classification problems more effectively. The improvement of IA-SUWO focuses on two aspects: (1) jointly considering the least squares support numerical spectrum values and the IMWMO method to assign weights to minority instances, and (2) synthesizing new instances using the k* information nearest neighbors (k*INN) method. IA-SUWO aims to maximize the probability that all important minority samples will be drawn and to generate more efficient (more scattered) boundary instances. Results demonstrate that IA-SUWO achieves significantly better results on most datasets compared with 10 other oversampling algorithms and 2 ensemble algorithms. © 2020 Elsevier B.V. All rights reserved.
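To make the general idea in the abstract concrete, the following is a minimal sketch of weighted, SMOTE-style oversampling. It is not IA-SUWO or A-SUWO: the function name `weighted_oversample`, the inverse-distance weighting rule, and the toy data are all assumptions chosen for illustration, and plain k nearest neighbors stands in for the k*INN method, which the abstract does not specify.

```python
import numpy as np


def weighted_oversample(X_min, X_maj, n_new, k=3, seed=0):
    """Hypothetical helper: SMOTE-style interpolation with distance-based seed weights."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    X_maj = np.asarray(X_maj, dtype=float)

    # Euclidean distance from each minority sample to its closest majority sample.
    d_maj = np.linalg.norm(X_min[:, None, :] - X_maj[None, :, :], axis=2).min(axis=1)
    # Assumption: samples closer to the majority class are harder to learn,
    # so they receive a larger weight (simple inverse-distance rule).
    weights = 1.0 / (d_maj + 1e-12)
    probs = weights / weights.sum()

    # Indices of the k nearest minority neighbours of each minority sample.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)  # a sample is not its own neighbour
    k = min(k, len(X_min) - 1)
    neighbours = np.argsort(d_min, axis=1)[:, :k]

    # Draw seed samples according to the weights, pick one neighbour per seed,
    # and interpolate at a random position on the segment between them.
    seeds = rng.choice(len(X_min), size=n_new, p=probs)
    picks = neighbours[seeds, rng.integers(0, k, size=n_new)]
    gaps = rng.random((n_new, 1))
    synthetic = X_min[seeds] + gaps * (X_min[picks] - X_min[seeds])
    return synthetic, probs


# Toy data (made up): four minority points on the unit square, three majority points.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_maj = np.array([[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
synthetic, probs = weighted_oversample(X_min, X_maj, n_new=5)
```

In this toy setup the minority point nearest the majority cluster gets the largest sampling probability, so more synthetic samples are generated near the class boundary. Every synthetic point lies on a segment between two existing minority points.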
Affiliations:
Zhang, Chongsheng: Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China
Bi, Jingjun: Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China
Xu, Shixin: Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China
Ramentol, Enislay: SICS Swedish ICT, Isafjordsgatan 22, Box 1263, SE-16429 Kista, Sweden; Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China
Fan, Gaojuan: Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China
Qiao, Baojun: Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China
Fujita, Hamido: Ho Chi Minh City Univ Technol HUTECH, Fac Informat Technol, Ho Chi Minh City, Vietnam; Iwate Prefectural Univ, Fac Software & Informat Sci, Takizawa, Iwate 0200693, Japan; Henan Univ, Big Data Res Ctr, Kaifeng 475001, Peoples R China

Zhang, Jianhua: East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Cui, Xiqing: East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Li, Jianrong: East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Wang, Rubin: East China Univ Sci & Technol, Sch Sci, Shanghai 200237, Peoples R China; East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China