An oversampling method for multi-class imbalanced data based on composite weights

Cited by: 9
Authors
Deng, Mingyang [1, 2]
Guo, Yingshi [1]
Wang, Chang [1]
Wu, Fuwei [1]
Affiliations
[1] Changan Univ, Sch Automobile, Xian, Peoples R China
[2] Changchun Univ Technol, Coll Automobile Engn, Coll Humanities & Informat, Changchun, Peoples R China
Source
PLOS ONE | 2021 / Vol. 16 / Issue 11
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
ALGORITHM; CLASSIFICATION; SMOTE
DOI
10.1371/journal.pone.0259227
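Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract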
To address the oversampling problem for multi-class small-sample data and to improve classification accuracy, we develop an oversampling method based on classification ranking and weight setting. The algorithm sorts the data within each class of the dataset by the distance from each original data point to the hyperplane. Iterative sampling is then performed within each class, and inter-class sampling is applied at the boundaries of adjacent classes, according to a sampling weight composed of data density and data sorting. Finally, information assignment is performed on all newly generated samples. Training and testing experiments are conducted on UCI imbalanced datasets, and composite metrics are established to evaluate the proposed algorithm against other algorithms in a comprehensive evaluation. The results show that the proposed algorithm balances the multi-class imbalanced data in terms of quantity, and that the newly generated data maintain the distribution characteristics and information properties of the original samples. Moreover, compared with other algorithms such as SMOTE and SVMOM, the proposed algorithm reaches a higher classification accuracy of about 90%. We conclude that the algorithm is practical and general for imbalanced multi-class samples.
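The intra-class step described in the abstract (rank points by distance to a hyperplane, combine that rank with a density term into a sampling weight, then generate new points) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `composite_weights` and `oversample_class`, the mixing parameter `alpha`, the k-nearest-neighbour density estimate, the choice to favour points near the hyperplane, and the SMOTE-style interpolation are all assumptions made for illustration. The hyperplane `(w, b)` is taken as given here, and the inter-class boundary sampling and information assignment steps are not covered.

```python
import numpy as np

def composite_weights(X, w, b, k=3, alpha=0.5):
    """Hypothetical composite weight: a mix of a rank term and a density term."""
    n = len(X)
    # Distance from each point to the hyperplane w.x + b = 0.
    dist = np.abs(X @ w + b) / np.linalg.norm(w)
    # Rank 0 = closest to the hyperplane; closer points get a larger rank term
    # (an assumption, the abstract does not state the direction).
    ranks = np.argsort(np.argsort(dist))
    rank_term = (n - ranks) / n
    # Density term: inverse mean distance to the k nearest same-class neighbours.
    pair = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn = np.sort(pair, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    density = 1.0 / (knn.mean(axis=1) + 1e-12)
    density_term = density / density.max()
    weights = alpha * rank_term + (1 - alpha) * density_term
    return weights / weights.sum()  # normalise to a probability vector

def oversample_class(X, n_new, w, b, k=3, alpha=0.5, seed=0):
    """Draw seed points by composite weight, then interpolate SMOTE-style."""
    rng = np.random.default_rng(seed)
    p = composite_weights(X, w, b, k, alpha)
    pair = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    neighbours = np.argsort(pair, axis=1)[:, 1:k + 1]
    seeds = rng.choice(len(X), size=n_new, p=p)
    partners = neighbours[seeds, rng.integers(0, k, size=n_new)]
    gaps = rng.random((n_new, 1))
    # Each new point lies on the segment between a seed and one of its neighbours.
    return X[seeds] + gaps * (X[partners] - X[seeds])
```

Calling `oversample_class(X_minority, n_new, w, b)` on the rows of one minority class returns `n_new` synthetic rows. Because every synthetic point is a convex combination of two original points, the output stays inside the bounding box of the class, which is one simple way to keep new data close to the original distribution.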
Pages: 15
Related papers
50 in total
  • [31] Learning from Combination of Data Chunks for Multi-class Imbalanced Data
    Liu, Xu-Ying
    Li, Qian-Qian
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 1680 - 1687
  • [32] An Under-Sampling Method with Support Vectors in Multi-class Imbalanced Data Classification
    Arafat, Md. Yasir
    Hoque, Sabera
    Xu, Shuxiang
    Farid, Dewan Md.
    2019 13TH INTERNATIONAL CONFERENCE ON SOFTWARE, KNOWLEDGE, INFORMATION MANAGEMENT AND APPLICATIONS (SKIMA), 2019
  • [33] AUC Evaluation of Multi-class Classifier Performance in Imbalanced Data
    Ni, Huangjing
    Wang, Wei
    2010 INTERNATIONAL CONFERENCE ON FUTURE CONTROL AND AUTOMATION (ICFCA 2010), 2010, : 48 - 51
  • [34] Efficient DANNLO classifier for multi-class imbalanced data on Hadoop
    Satyanarayana S.
    Tayar Y.
    Prasad R.S.R.
    International Journal of Information Technology, 2019, 11 (2) : 321 - 329
  • [35] Selecting local ensembles for multi-class imbalanced data classification
    Krawczyk, Bartosz
    Cano, Alberto
    Wozniak, Michal
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
  • [36] Undersampling with Support Vectors for Multi-Class Imbalanced Data Classification
    Krawczyk, Bartosz
    Bellinger, Colin
    Corizzo, Roberto
    Japkowicz, Nathalie
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
  • [37] GMMSampling: a new model-based, data difficulty-driven resampling method for multi-class imbalanced data
    Naglik, Iwo
    Lango, Mateusz
    MACHINE LEARNING, 2024, 113 (08) : 5183 - 5202
  • [38] A Partial Labeling Framework for Multi-Class Imbalanced Streaming Data
    Arabmakki, Elaheh
    Kantardzic, Mehmed
    Sethi, Tegjyot Singh
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 1018 - 1025
  • [39] Multi-class Ensemble Learning of Imbalanced Bidding Fraud Data
    Anowar, Farzana
    Sadaoui, Samira
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11489 : 352 - 358
  • [40] New imbalanced bearing fault diagnosis method based on Sample-characteristic Oversampling TechniquE (SCOTE) and multi-class LS-SVM
    Wei, Jianan
    Huang, Haisong
    Yao, Liguo
    Hu, Yao
    Fan, Qingsong
    Huang, Dong
    APPLIED SOFT COMPUTING, 2021, 101