A label noise filtering method for regression based on adaptive threshold and noise score

Cited by: 9
Authors
Li, Chuang [1]
Mao, Zhizhong [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
Keywords
Noise filter; Real-valued label noise; Adaptive noise determination; Noise score; Ensemble filtering; Iterative filtering; CLASSIFICATION; PERFORMANCE; SELECTION; PREDICTION; RANKING; FUSION; TESTS; SET;
DOI
10.1016/j.eswa.2023.120422
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The quality of training data plays a decisive role in the establishment of intelligent models. Since raw data obtained from the real world are usually entwined with noise due to a variety of causes, noise filtering has become an important aspect of machine learning techniques. In contrast with the extensive research conducted on noise elimination for classification purposes, papers addressing this problem for regression tasks are rather scarce. In this paper, we propose a novel noise filter to clean noisy instances with real-valued label noise. To address the deficiency of the existing noise determination criterion, a new adaptive threshold-based method is first proposed. It allows a noisy instance to be adaptively defined according to the fitting difficulty levels of different datasets and of areas with different densities. Embedded with this criterion, an effective noise filtering procedure is also designed. An ensemble filtering scheme and an iterative filtering process are combined to detect as many potential noisy samples as possible from the original training set. According to the acquired noise detection information, a noise score for evaluating the noise level is specifically developed. The potential noisy samples whose scores exceed a reasonable threshold are further filtered, which can compensate for the possible errors incurred during the previous procedure and contribute to more reliable filtering results. The validity of the proposed method is studied in exhaustive experiments. We discuss reasonable hyperparameters and compare the developed method with several state-of-the-art noise filters. The outcomes show that the prediction accuracy of the utilized regressor can greatly benefit from preprocessing the given raw dataset by using our method. Simultaneously, the method is able to achieve a good balance between the elimination of noisy samples and the retention of clean samples, and consistently achieves a better noise filtering performance.
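The abstract describes the pipeline only in outline, so the following is a rough illustrative sketch of the general idea, not the authors' algorithm. Everything in it is an assumption made for illustration: the function name `noise_scores`, the use of polynomial fits of several degrees as the ensemble, a residual threshold scaled by the median absolute deviation (so the threshold adapts to how hard the data are to fit), and a noise score defined as the fraction of (model, pass) rounds that flag a sample. The density-dependent part of the criterion mentioned in the abstract is not modelled here.

```python
import numpy as np

def noise_scores(x, y, degrees=(1, 2, 3), n_iter=3, alpha=3.0):
    """Hypothetical noise score: fraction of (model, pass) rounds flagging each sample."""
    n = len(y)
    votes = np.zeros(n)
    active = np.ones(n, dtype=bool)            # samples currently used for fitting
    for _ in range(n_iter):                    # iterative filtering passes
        pass_votes = np.zeros(n)
        for deg in degrees:                    # ensemble of stand-in regressors
            coef = np.polyfit(x[active], y[active], deg)
            resid = y - np.polyval(coef, x)    # residuals on all samples
            centre = np.median(resid[active])
            mad = np.median(np.abs(resid[active] - centre))
            thr = alpha * 1.4826 * mad         # threshold scales with fit difficulty
            pass_votes += np.abs(resid - centre) > thr
        votes += pass_votes
        # samples flagged by a majority of models are left out of the next fit
        active = pass_votes <= len(degrees) / 2
    return votes / (n_iter * len(degrees))

# Toy data: a linear trend with small noise and four corrupted labels.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y = 2.0 * x + rng.normal(0.0, 0.01, size=60)
noisy_idx = np.array([5, 20, 35, 50])
y[noisy_idx] += 5.0

scores = noise_scores(x, y)
keep = scores < 0.5                            # final filtering by score threshold
print("removed indices:", np.flatnonzero(~keep))
```

In this sketch, a clean sample that is flagged only in the first pass, when the fits are still distorted by the corrupted labels, ends up with a low score and is retained. This mirrors, in a simplified way, the abstract's point that the score-based step can compensate for errors made earlier in the procedure.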
Pages: 19