IRDA: Implicit data augmentation for deep imbalanced regression

Cited by: 1
Authors
Zhu, Weiyao [1]
Wu, Ou [1]
Yang, Nan [1]
Affiliations
[1] Tianjin Univ, Ctr Appl Math, Tianjin 300072, Peoples R China
Keywords
Deep imbalanced regression; Implicit data augmentation; Regularization; Regression loss
DOI
10.1016/j.ins.2024.120873
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Imbalanced data distributions are prevalent in real-world classification and regression tasks. Data augmentation is commonly employed to mitigate this issue, and implicit methods are gaining attention for their effectiveness and efficiency. However, implicit data augmentation has not been extensively explored for regression tasks. To address this gap, we introduce IRDA, a novel learning method for regression that incorporates implicit data augmentation. Our approach includes a new augmentation strategy tailored to deep imbalanced regression tasks, and a regression loss function suited to real-world data with imbalanced label distributions and non-uniformly distributed features. We derive an easily computable surrogate loss and propose two implicit data augmentation algorithms, one incorporating meta-learning and one without. Additionally, we provide a regularization perspective to offer a deeper understanding of IRDA. We evaluate IRDA on five datasets, including a large-scale dataset, demonstrating its effectiveness in mitigating the adverse effects of imbalanced data distributions and its adaptability to various regression tasks.
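The abstract does not give IRDA's loss, so the following is only a generic illustration of what "implicit" augmentation with an "easily computable surrogate loss" can mean, not the authors' method. It assumes a linear prediction head and independent zero-mean Gaussian noise on each feature dimension; under those assumptions the expected squared error over infinitely many augmented copies has a closed form, namely the ordinary squared error plus a penalty on the weights, which also hints at why a regularization view arises. All names (`implicit_aug_mse`, `explicit_aug_mse`, `var`) are hypothetical.

```python
import random


def implicit_aug_mse(features, targets, w, b, var):
    """Closed-form expected squared error under Gaussian feature noise.

    Assumption (not from the paper): y_hat = w . h + b, and each feature
    dimension j receives independent noise delta_j ~ N(0, var[j]). Then
    E[(w . (h + delta) + b - y)^2] = (w . h + b - y)^2 + sum_j w_j^2 * var[j].
    """
    # Weight penalty contributed by the perturbation; identical for every sample.
    penalty = sum(wj * wj * vj for wj, vj in zip(w, var))
    sq_errors = [
        (sum(wj * hj for wj, hj in zip(w, h)) + b - y) ** 2
        for h, y in zip(features, targets)
    ]
    return sum(sq_errors) / len(sq_errors) + penalty


def explicit_aug_mse(features, targets, w, b, var, n_copies, rng):
    """Monte Carlo estimate of the same expectation from sampled noisy copies."""
    total, count = 0.0, 0
    for h, y in zip(features, targets):
        for _ in range(n_copies):
            noisy = [hj + rng.gauss(0.0, vj ** 0.5) for hj, vj in zip(h, var)]
            pred = sum(wj * xj for wj, xj in zip(w, noisy)) + b
            total += (pred - y) ** 2
            count += 1
    return total / count


# Toy data, chosen arbitrarily for illustration.
features = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
targets = [2.0, 0.0, 1.5]
w, b, var = [0.5, -1.0], 0.1, [0.1, 0.2]

closed_form = implicit_aug_mse(features, targets, w, b, var)
sampled = explicit_aug_mse(features, targets, w, b, var, 5000, random.Random(0))
print(closed_form, sampled)
```

The two printed values should agree up to sampling noise, while the closed form needs no augmented samples at all. A method for imbalanced regression would presumably make the perturbation statistics depend on the label region, which this sketch does not attempt.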
Pages: 20
Related papers
50 in total
  • [31] Stable variable selection of class-imbalanced data with precision-recall criterion
    Fu, Guang-Hui
    Xu, Feng
    Zhang, Bing-Yang
    Yi, Lun-Zhao
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2017, 171 : 241 - 250
  • [32] Cut-Thumbnail: A Novel Data Augmentation for Convolutional Neural Network
    Xie, Tianshu
    Cheng, Xuan
    Wang, Xiaomin
    Liu, Minghui
    Deng, Jiali
    Zhou, Tao
    Liu, Ming
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1627 - 1635
  • [33] (1+ε)-class Classification: an Anomaly Detection Method for Highly Imbalanced or Incomplete Data Sets
    Borisyak, Maxim
    Ryzhikov, Artem
    Ustyuzhanin, Andrey
    Derkach, Denis
    Ratnikov, Fedor
    Mineeva, Olga
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [34] Research on transformer fault diagnosis based on active learning with imbalanced data of dissolved gas in oil
    Tang, Pengfei
    Zhang, Zhonghao
    Tong, Jie
    Ma, Zhenyuan
    Long, Tianhang
    Huang, Can
    Qi, Zihao
    REVIEW OF SCIENTIFIC INSTRUMENTS, 2024, 95 (05):
  • [35] Data integrative Bayesian inference for mixtures of regression models
    Aflakparast, Mehran
    de Gunst, Mathisca
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2019, 68 (04) : 941 - 962
  • [36] Recent Advances on Penalized Regression Models for Biological Data
    Wang, Pei
    Chen, Shunjie
    Yang, Sijia
    MATHEMATICS, 2022, 10 (19)
  • [37] A Novel Data-Adaptive Regression Framework Based on Multivariate Adaptive Regression Splines for Electrocardiographic Imaging
    Onak, Onder
    Erenler, Taha
    Serinagaoglu, Yesim
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2022, 69 (02) : 963 - 974
  • [38] Dual sparse learning via data augmentation for robust facial image classification
    Shaoning Zeng
    Bob Zhang
    Yanghao Zhang
    Jianping Gou
    International Journal of Machine Learning and Cybernetics, 2020, 11 : 1717 - 1734
  • [39] A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression
    Hai-Hui Huang
    Hao Rao
    Rui Miao
    Yong Liang
    BMC Bioinformatics, 23
  • [40] A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression
    Huang, Hai-Hui
    Rao, Hao
    Miao, Rui
    Liang, Yong
    BMC BIOINFORMATICS, 2022, 23 (SUPPL 10)