Stacking density estimation and its oversampling method for continuously imbalanced data in chemometrics

Cited by: 0
Authors
Zhao, Xin-Ru [1]
Yi, Lun-Zhao [2]
Fu, Guang-Hui [1]
Affiliations
[1] Kunming Univ Sci & Technol, Sch Sci, Kunming 650500, Peoples R China
[2] Kunming Univ Sci & Technol, Fac Food Sci & Engn, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Imbalanced regression; Density estimation; Stacking; Oversampling; Rare value prediction; CLASSIFICATION; REGRESSION;
DOI
10.1016/j.chemolab.2025.105366
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Continuously imbalanced data means that the target variable is continuous and its distribution is imbalanced. This kind of data is widespread in many practical application areas. However, methods for handling continuously imbalanced data remain relatively scarce, and there is an urgent need for corresponding imbalanced regression methods to enhance the capability of handling such data. First, we propose a Stacking-based density estimation (SDE) method to solve the density estimation problem for continuously imbalanced target variables. SDE links density estimation with the ensemble algorithm Stacking, and its core concept is the "fusion of multiple perspectives for accurate estimation". Performing SDE enhances the model's understanding of complex data structures and makes it more accurate in identifying rare values. Subsequently, we investigate an SDE-based oversampling method (SDE-OS). SDE-OS uses SDE to synthesize new rare instances in the rare-value region, achieving customization of rare-value additions. In a series of numerical experiments, SDE estimates density more accurately than the kernel density estimation method in terms of average negative log-likelihood (ANLL). SDE-OS outperforms conventional methods such as SMOGN and SMOTER on various metrics. Therefore, the proposed SDE and SDE-OS are competitive and effective tools for addressing the imbalanced regression problem.
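The idea of stacking several density estimators and scoring the result by ANLL can be illustrated with a minimal sketch. This is not the authors' SDE implementation: the abstract does not specify the base estimators or the combiner, so the choices here (Gaussian kernel density estimates with three different bandwidths as base models, and convex stacking weights fitted by EM on cross-validated held-out densities) are assumptions made only for illustration, as are all function names and the synthetic target variable.

```python
import numpy as np

def gaussian_kde_pdf(train, query, bandwidth):
    """Gaussian kernel density estimate at `query`, fitted on `train` (1-D arrays)."""
    diffs = (query[:, None] - train[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def anll(density_values, eps=1e-300):
    """Average negative log-likelihood of the given density values (lower is better)."""
    return float(-np.mean(np.log(np.maximum(density_values, eps))))

def stacking_weights(heldout_pdfs, n_iter=200):
    """EM updates for non-negative weights summing to 1 that increase
    the held-out log-likelihood of the weighted mixture of base densities."""
    n_models = heldout_pdfs.shape[1]
    w = np.full(n_models, 1.0 / n_models)
    for _ in range(n_iter):
        mix = np.maximum(heldout_pdfs @ w, 1e-300)
        responsibilities = heldout_pdfs * w / mix[:, None]
        w = responsibilities.mean(axis=0)
    return w

def fit_stacked_density(y, bandwidths, n_folds=5, seed=0):
    """Collect out-of-fold densities from each base KDE, then fit stacking weights."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    heldout = np.empty((len(y), len(bandwidths)))
    for fold in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[fold] = False
        for j, h in enumerate(bandwidths):
            heldout[fold, j] = gaussian_kde_pdf(y[train_mask], y[fold], h)
    return stacking_weights(heldout), heldout

def stacked_pdf(y, query, bandwidths, weights):
    """Weighted combination of the base KDEs refitted on all of `y`."""
    base = np.column_stack([gaussian_kde_pdf(y, query, h) for h in bandwidths])
    return base @ weights

# Synthetic imbalanced target (hypothetical): many common values near 0, few rare values near 6.
rng = np.random.default_rng(42)
y = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 0.5, 15)])
bandwidths = [0.1, 0.3, 1.0]

weights, heldout = fit_stacked_density(y, bandwidths)
grid = np.linspace(-8.0, 12.0, 4001)
total_mass = float(np.sum(stacked_pdf(y, grid, bandwidths, weights)) * (grid[1] - grid[0]))

print("stacking weights:", np.round(weights, 3))
print("held-out ANLL, stacked:", anll(heldout @ weights))
print("held-out ANLL, equal weights:", anll(heldout.mean(axis=1)))
```

Because the weights are fitted on out-of-fold densities, no base estimator is rewarded for memorizing its own training points, which is the usual motivation for stacking. An oversampling step in the spirit of SDE-OS would then use the estimated density to locate the rare-value region, but its details are not given in the abstract and are not sketched here.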
Pages: 23
Related papers
50 in total
  • [22] An oversampling method for imbalanced data based on spatial distribution of minority samples SD-KMSMOTE
    Yang, Wensheng
    Pan, Chengsheng
    Zhang, Yanyan
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [23] A New Segmented Oversampling Method for Imbalanced Data Classification Using Quasi-Linear SVM
    Zhou, Bo
    Li, Weite
    Hu, Jinglu
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2017, 12 (06) : 891 - 898
  • [24] Synthetic protein sequence oversampling method for classification and remote homology detection in imbalanced protein data
    Beigi, Majid M.
    Zell, Andreas
    BIOINFORMATICS RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2007, 4414 : 263 - +
  • [25] Self-adaptive oversampling method based on the complexity of minority data in imbalanced datasets classification
    Tao, Xinmin
    Guo, Xinyue
    Zheng, Yujia
    Zhang, Xiaohan
    Chen, Zhiyu
    KNOWLEDGE-BASED SYSTEMS, 2023, 277
  • [26] Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction
    Talukder, Md. Alamin
    Islam, Md. Manowarul
    Uddin, Md Ashraf
    Hasan, Khondokar Fida
    Sharmin, Selina
    Alyami, Salem A.
    Moni, Mohammad Ali
    JOURNAL OF BIG DATA, 2024, 11 (01)
  • [27] A Mixed Sampling Method for Imbalanced Data Based on Neighborhood Density
    Hu, Feng
    Yu, Chunlin
    Dai, Jin
    Liu, Ke
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2019, : 94 - 98
  • [28] Undersampling method based on minority class density for imbalanced data
    Sun, Zhongqiang
    Ying, Wenhao
    Zhang, Wenjin
    Gong, Shengrong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [30] Improving Diagnostic Performance of High-Voltage Circuit Breakers on Imbalanced Data Using an Oversampling Method
    Chen, Lei
    Wan, Shuting
    Dou, Longjiang
    IEEE TRANSACTIONS ON POWER DELIVERY, 2022, 37 (04) : 2704 - 2716