A label noise filtering method for regression based on adaptive threshold and noise score

Cited by: 9
Authors
Li, Chuang [1]
Mao, Zhizhong [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
Keywords
Noise filter; Real-valued label noise; Adaptive noise determination; Noise score; Ensemble filtering; Iterative filtering; CLASSIFICATION; PERFORMANCE; SELECTION; PREDICTION; RANKING; FUSION; TESTS; SET;
DOI
10.1016/j.eswa.2023.120422
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The quality of training data plays a decisive role in the establishment of intelligent models. Since raw data obtained from the real world are usually entwined with noise due to a variety of causes, noise filtering has become an important aspect of machine learning techniques. In contrast with the extensive research conducted on noise elimination for classification purposes, papers addressing this problem for regression tasks are rather scarce. In this paper, we propose a novel noise filter to clean noisy instances with real-valued label noise. To address the deficiency of the existing noise determination criterion, a new adaptive threshold-based method is first proposed. It allows a noisy instance to be adaptively defined according to the fitting difficulty levels of different datasets and of areas with different densities. Embedded with this criterion, an effective noise filtering procedure is also designed. An ensemble filtering scheme and an iterative filtering process are combined to detect as many potential noisy samples as possible from the original training set. According to the acquired noise detection information, a noise score for evaluating the noise level is specifically developed. The potential noisy samples whose scores exceed a reasonable threshold are further filtered, which can compensate for the possible errors incurred during the previous procedure and contribute to more reliable filtering results. The validity of the proposed method is studied in exhaustive experiments. We discuss reasonable hyperparameters and compare the developed method with several state-of-the-art noise filters. The outcomes show that the prediction accuracy of the utilized regressor can greatly benefit from preprocessing the given raw dataset by using our method. Simultaneously, the method is able to achieve a good balance between the elimination of noisy samples and the retention of clean samples, and consistently achieves a better noise filtering performance.
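The abstract names four ingredients (an adaptive threshold, ensemble filtering, iterative filtering, and a noise score) but this record gives no formulas. The following is a minimal Python sketch of how such a pipeline could be wired together, not the authors' algorithm. Everything concrete in it is an assumption made for illustration: the k-nearest-neighbour regressors used as ensemble members, the threshold rule (`scale` times the median absolute residual), the score (fraction of members that still flag a candidate when it is predicted from the retained samples), and all function and parameter names.

```python
from statistics import median

def knn_predict(xs, ys, i, k, pool):
    """Mean label of the k samples in `pool` nearest to sample i (i excluded)."""
    nearest = sorted((j for j in sorted(pool) if j != i),
                     key=lambda j: abs(xs[j] - xs[i]))[:k]
    return sum(ys[j] for j in nearest) / len(nearest)

def residual(xs, ys, i, k, pool):
    """Absolute error of the k-NN prediction for sample i."""
    return abs(ys[i] - knn_predict(xs, ys, i, k, pool))

def adaptive_cut(xs, ys, k, pool, scale):
    """Assumed threshold rule: scale * median residual over the pool,
    so datasets that are harder to fit get a looser cut."""
    return scale * median(residual(xs, ys, i, k, pool) for i in pool)

def filter_noise(xs, ys, ks=(2, 3, 4), scale=3.0, score_cut=0.5, max_rounds=10):
    clean = set(range(len(xs)))
    candidates = set()
    # Stage 1: iterative ensemble detection. A sample becomes a candidate
    # if ANY ensemble member flags it; repeat until nothing new is flagged.
    for _ in range(max_rounds):
        flagged = set()
        for k in ks:
            cut = adaptive_cut(xs, ys, k, clean, scale)
            flagged |= {i for i in clean if residual(xs, ys, i, k, clean) > cut}
        if not flagged:
            break
        candidates |= flagged
        clean -= flagged
    # Stage 2: noise score. Re-predict each candidate from the retained
    # samples only; the score is the fraction of members still flagging it.
    cuts = {k: adaptive_cut(xs, ys, k, clean, scale) for k in ks}
    scores = {i: sum(residual(xs, ys, i, k, clean) > cuts[k] for k in ks) / len(ks)
              for i in candidates}
    noisy = {i for i, s in scores.items() if s > score_cut}
    return noisy, scores

# Toy data: labels alternate around 10, with one corrupted label at index 10.
xs = list(range(20))
ys = [10.5 if i % 2 == 0 else 9.5 for i in xs]
ys[10] = 30.0
noisy, scores = filter_noise(xs, ys)
print(sorted(noisy), scores)
```

On this toy data the first stage also flags the corrupted sample's neighbours (indices 8, 9, 11, 12), because their predictions are contaminated by it. The second stage gives them a score of 0 and keeps them, so only index 10 is removed. This is one way to read the abstract's remark that the score compensates for errors made during detection.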
Pages: 19
Related papers
50 in total
  • [21] Threshold for noise in daycare: Noise level and noise variability are associated with child wellbeing in home-based childcare
    Linting, Marielle
    Groeneveld, Marleen G.
    Vermeer, Harriet J.
    van Ijzendoorn, Marinus H.
    EARLY CHILDHOOD RESEARCH QUARTERLY, 2013, 28 (04) : 960 - 971
  • [22] Recognizing Brain Tumors Using Adaptive Noise Filtering and Statistical Features
    Rasheed, Mehwish
    Iqbal, Muhammad Waseem
    Jaffar, Arfan
    Ashraf, Muhammad Usman
    Almarhabi, Khalid Ali
    Alghamdi, Ahmed Mohammed
    Bahaddad, Adel A.
    DIAGNOSTICS, 2023, 13 (08)
  • [23] More reliable biomarkers and more accurate prediction for mental disorders using a label-noise filtering-based dimensional prediction method
    Xing, Ying
    van Erp, Theo G. M.
    Pearlson, Godfrey D.
    Kochunov, Peter
    Calhoun, Vince D.
    Du, Yuhui
    ISCIENCE, 2024, 27 (03)
  • [24] An Ensemble and Iterative Recovery Strategy Based kGNN Method to Edit Data with Label Noise
    Chen, Baiyun
    Huang, Longhai
    Chen, Zizhong
    Wang, Guoyin
    MATHEMATICS, 2022, 10 (15)
  • [25] Robust Deep Softmax Regression Against Label Noise for Unsupervised Domain Adaptation
    Wu, Guangbin
    Zhang, David
    Chen, Weishan
    Zuo, Wangmeng
    Xia, Zhuang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2019, 33 (07)
  • [26] Weak fault feature extraction of rolling bearings based on improved ensemble noise-reconstructed EMD and adaptive threshold denoising
    Yin, Chen
    Wang, Yulin
    Ma, Guocai
    Wang, Yan
    Sun, Yuxin
    He, Yan
    MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2022, 171
  • [27] Atmospheric PM2.5 concentration prediction and noise estimation based on adaptive unscented Kalman filtering
    Li, Jihan
    Li, Xiaoli
    Wang, Kang
    Cui, Guimei
    MEASUREMENT & CONTROL, 2021, 54 (3-4) : 292 - 302
  • [28] Mapping Urban Environmental Noise: A Land Use Regression Method
    Xie, Dan
    Liu, Yi
    Chen, Jining
    ENVIRONMENTAL SCIENCE & TECHNOLOGY, 2011, 45 (17) : 7358 - 7364
  • [29] An adaptive and general model for label noise detection using relative probabilistic density
    Xia, Shuyin
    Huang, Longhai
    Wang, Guoyin
    Gao, Xinbo
    Shao, Yabin
    Chen, Zizhong
    KNOWLEDGE-BASED SYSTEMS, 2022, 239
  • [30] Adaptive Label Noise Cleaning with Meta-Supervision for Deep Face Recognition
    Zhang, Yaobin
    Deng, Weihong
    Zhong, Yaoyao
    Hu, Jiani
    Li, Xian
    Zhao, Dongyue
    Wen, Dongchao
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 15045 - 15055