Gray Learning From Non-IID Data With Out-of-Distribution Samples

Times Cited: 0
Authors
Zhao, Zhilin [1 ,2 ]
Cao, Longbing [1 ,2 ]
Wang, Chang-Dong [3 ,4 ]
Affiliations
[1] Macquarie Univ, Data Sci Lab, Sch Comp, Sydney, NSW 2109, Australia
[2] Macquarie Univ, Data Sci Lab, DataX Res Ctr, Sydney, NSW 2109, Australia
[3] Sun Yat Sen Univ, Comp Sci & Engn, Guangdong Prov Key Lab Computat Sci, Minist Educ, Guangzhou 510275, Peoples R China
[4] Sun Yat Sen Univ, Comp Sci & Engn, Key Lab Machine Intelligence & Adv Comp, Minist Educ, Guangzhou 510275, Peoples R China
Funding
Australian Research Council;
Keywords
Training; Noise measurement; Task analysis; Complexity theory; Training data; Neural networks; Metalearning; Complementary label; generalization; gray learning (GL); non-independent and identically distributed (Non-IID) data; out-of-distribution data;
DOI
10.1109/TNNLS.2023.3330475
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The integrity of training data, even when annotated by experts, is far from guaranteed, especially for non-independent and identically distributed (non-IID) datasets comprising both in- and out-of-distribution samples. In an ideal scenario, the majority of samples would be in-distribution, while samples that deviate semantically would be identified as out-of-distribution and excluded during the annotation process. However, experts may erroneously classify these out-of-distribution samples as in-distribution, assigning them labels that are inherently unreliable. This mixture of unreliable labels and varied data types makes the task of learning robust neural networks notably challenging. We observe that both in- and out-of-distribution samples can almost invariably be ruled out from belonging to certain classes, aside from those corresponding to unreliable ground-truth labels. This opens the possibility of utilizing reliable complementary labels that indicate the classes to which a sample does not belong. Guided by this insight, we introduce a novel approach, termed gray learning (GL), which leverages both ground-truth and complementary labels. Crucially, GL adaptively adjusts the loss weights for these two label types based on prediction confidence levels. By grounding our approach in statistical learning theory, we derive bounds for the generalization error, demonstrating that GL achieves tight constraints even in non-IID settings. Extensive experimental evaluations reveal that our method significantly outperforms alternative approaches grounded in robust statistics.
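The abstract describes GL as a weighted combination of two losses: an ordinary loss on the (possibly unreliable) ground-truth label and a loss on the reliable complementary labels, i.e., all classes the sample can be ruled out of, with the weights adapted by prediction confidence. The following is a speculative sketch of that idea based only on the abstract; the function name, the per-class complementary term, and the choice of the predicted probability of the given label as the confidence weight are all assumptions, not the paper's actual formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gray_learning_loss(logits, labels):
    """Hypothetical GL-style loss: blend the ground-truth-label loss and the
    complementary-label loss, weighted by prediction confidence."""
    p = softmax(logits)
    n, k = p.shape
    p_y = p[np.arange(n), labels]
    ce = -np.log(p_y + 1e-12)                    # loss on the given (gray) label
    comp = -np.log(1.0 - p + 1e-12)              # per-class complementary loss
    comp[np.arange(n), labels] = 0.0             # the ground-truth class is not complementary
    comp = comp.sum(axis=1) / (k - 1)            # average over complementary classes
    w = p_y                                      # confidence weight (an assumption)
    return float(np.mean(w * ce + (1.0 - w) * comp))
```

Under this sketch, a sample the network predicts confidently is trained mostly with its ground-truth label, while a low-confidence (plausibly out-of-distribution) sample falls back on its complementary labels, which remain reliable even when the annotated label is not.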
Pages: 1396-1409
Page Count: 14
Related Papers
50 items in total
  • [31] AdaDpFed: A Differentially Private Federated Learning Algorithm With Adaptive Noise on Non-IID Data
    Zhao, Zirun
    Sun, Yi
    Bashir, Ali Kashif
    Lin, Zhaowen
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (01) : 2536 - 2545
  • [32] Experimenting With Normalization Layers in Federated Learning on Non-IID Scenarios
    Casella, Bruno
    Esposito, Roberto
    Sciarappa, Antonio
    Cavazzoni, Carlo
    Aldinucci, Marco
    IEEE ACCESS, 2024, 12 : 47961 - 47971
  • [33] Reschedule Gradients: Temporal Non-IID Resilient Federated Learning
    You, Xianyao
    Liu, Ximeng
    Jiang, Nan
    Cai, Jianping
    Ying, Zuobin
    IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (01) : 747 - 762
  • [34] FedPKR: Federated Learning With Non-IID Data via Periodic Knowledge Review in Edge Computing
    Wang, Jinbo
    Wang, Ruijin
    Xu, Guangquan
    He, Donglin
    Pei, Xikai
    Zhang, Fengli
    Gan, Jie
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (06): 902 - 912
  • [35] First-Arrival Picking for Out-of-Distribution Noisy Data: A Cost-Effective Transfer Learning Method With Tens of Samples
    Li, Hanyang
    Li, Xuegui
    Sun, Yuhang
    Dong, Hongli
    Xu, Gang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [36] FedLC: Optimizing Federated Learning in Non-IID Data via Label-Wise Clustering
    Lee, Hunmin
    Seo, Daehee
    IEEE ACCESS, 2023, 11 : 42082 - 42095
  • [37] Verifying the Generalization of Deep Learning to Out-of-Distribution Domains
    Amir, Guy
    Maayan, Osher
    Zelazny, Tom
    Katz, Guy
    Schapira, Michael
    JOURNAL OF AUTOMATED REASONING, 2024, 68 (03)
  • [38] Tackling the Non-IID Issue in Heterogeneous Federated Learning by Gradient Harmonization
    Zhang, Xinyu
    Sun, Weiyu
    Chen, Ying
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2595 - 2599
  • [39] Non-IID Medical Imaging Data on COVID-19 in the Federated Learning Framework: Impact and Directions
    Alhafiz, Fatimah Saeed
    Basuhail, Abdullah Ahmad
    COVID, 2024, 4 (12): 1985 - 2016
  • [40] FEEL: Federated End-to-End Learning With Non-IID Data for Vehicular Ad Hoc Networks
    Li, Beibei
    Jiang, Yukun
    Pei, Qingqi
    Li, Tao
    Liu, Liang
    Lu, Rongxing
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (09) : 16728 - 16740