Exploratory parallel hybrid sampling framework for imbalanced data classification

Cited by: 0
Authors
Zheng, Ming [3, 4]
Zhao, Zhuo [3]
Wang, Fei [3]
Hu, Xiaowen [3]
Xu, Sheng [3, 4]
Li, Wanggen [3]
Li, Tong [1, 2]
Affiliations
[1] Yunnan Agr Univ, Big Data Sch, Kunming 650201, Peoples R China
[2] Yunnan Agr Univ, Key Lab Crop Prod & Smart Agr Yunnan Prov, Kunming 650201, Peoples R China
[3] Anhui Normal Univ, Sch Comp & Informat, Wuhu 241002, Peoples R China
[4] Anhui Prov Key Lab Ind Intelligence Data Secur, Wuhu 241002, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced data; Oversampling; Undersampling; Parallel hybrid sampling framework; Serial hybrid sampling frameworks; ENSEMBLE; SMOTE
DOI
10.1016/j.engappai.2024.109428
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Engineering applications often face the challenge of imbalanced data. Hybrid sampling is an effective way to address imbalanced data classification, because it avoids both the overfitting that can arise when oversampling is used alone and the mistaken deletion of useful majority samples that can arise when undersampling is used alone. However, most current hybrid sampling approaches are implemented serially, and applying the oversampling and undersampling steps one after the other causes them to interfere with each other. This study proposes a parallel hybrid sampling framework based on the idea of parallel engineering and theoretically analyzes its superiority. Experimental results show that, when applied to five classification algorithms and evaluated with three performance metrics, the proposed framework outperforms the two mainstream hybrid sampling frameworks. Moreover, the proposed framework effectively reduces the time consumed by the hybrid sampling process.
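The contrast between serial and parallel hybrid sampling can be illustrated with a minimal sketch. This is not the authors' framework, whose details are not given in the abstract. It assumes plain random oversampling and random undersampling as stand-ins, a midpoint target class size, and hypothetical function names (`oversample_minority`, `undersample_majority`, `parallel_hybrid_sample`). The one point it shows is that both samplers read only the original data, so neither sees the other's output.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def oversample_minority(minority, target_size, seed=0):
    """Random oversampling (assumed stand-in): duplicate minority samples
    with replacement until the class reaches target_size."""
    rng = random.Random(seed)
    n_extra = max(0, target_size - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    return list(minority) + extra


def undersample_majority(majority, target_size, seed=0):
    """Random undersampling (assumed stand-in): keep a random subset of the
    majority class, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(majority), min(target_size, len(majority)))


def parallel_hybrid_sample(majority, minority, seed=0):
    """Run both samplers independently on the ORIGINAL classes.

    The target size (midpoint of the two class sizes) is an assumption made
    for illustration only.
    """
    target = (len(majority) + len(minority)) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each task receives only original data, so the two steps cannot
        # influence each other, unlike a serial pipeline where the second
        # sampler operates on the first sampler's output.
        over = pool.submit(oversample_minority, minority, target, seed)
        under = pool.submit(undersample_majority, majority, target, seed)
        return under.result(), over.result()


if __name__ == "__main__":
    majority = [("maj", i) for i in range(90)]
    minority = [("min", i) for i in range(10)]
    new_majority, new_minority = parallel_hybrid_sample(majority, minority)
    print(len(new_majority), len(new_minority))
```

Threads are used here only to make the independence of the two steps explicit. In CPython they would not by themselves speed up CPU-bound sampling, so any timing benefit would need a process pool or a sampler that releases the interpreter lock.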
Pages: 13
Related papers
50 in total
  • [1] CLUS: A New Hybrid Sampling Classification for Imbalanced Data
    Prachuabsupakij, Wanthanee
    PROCEEDINGS OF THE 2015 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2015, : 281 - 286
  • [2] HSNF: Hybrid sampling with two-step noise filtering for imbalanced data classification
    Duan, Lilong
    Xue, Wei
    Gu, Xiaolei
    Luo, Xiao
    He, Yongsheng
    INTELLIGENT DATA ANALYSIS, 2023, 27 (06) : 1573 - 1593
  • [3] A Hybrid Sampling SVM Approach to Imbalanced Data Classification
    Wang, Qiang
    ABSTRACT AND APPLIED ANALYSIS, 2014,
  • [4] Framework for imbalanced data classification
    Blaszczyk, Mikolaj
    Jedrzejowicz, Joanna
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KSE 2021), 2021, 192 : 3477 - 3486
  • [5] Semi-supervised Classification Based Mixed Sampling for Imbalanced Data
    Zhao, Jianhua
    Liu, Ning
    OPEN PHYSICS, 2019, 17 (01): : 975 - 983
  • [6] Over-sampling algorithm for imbalanced data classification
    Xu Xiaolong
    Chen Wen
    Sun Yanfei
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2019, 30 (06) : 1182 - 1191
  • [7] A Hybrid Approach for Binary Classification of Imbalanced Data
    Tsai, Hsinhan
    Yang, Ta-Wei
    Wong, Wai-Man
    Kao, Han-Yi
    Chou, Cheng-Fu
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2024, 23 (03)
  • [8] An Approach to Imbalanced Data Classification Based on Instance Selection and Over-Sampling
    Czarnowski, Ireneusz
    Jedrzejowicz, Piotr
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, PT I, 2019, 11683 : 601 - 610
  • [9] A novel oversampling and feature selection hybrid algorithm for imbalanced data classification
    Feng, Fang
    Li, Kuan-Ching
    Yang, Erfu
    Zhou, Qingguo
    Han, Lihong
    Hussain, Amir
    Cai, Mingjiang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (03) : 3231 - 3267
  • [10] Hybrid sampling for imbalanced data
    Seiffert, Chris
    Khoshgoftaar, Taghi M.
    Van Hulse, Jason
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2009, 16 (03) : 193 - 210