Gray Learning From Non-IID Data With Out-of-Distribution Samples

Cited by: 0
Authors
Zhao, Zhilin [1 ,2 ]
Cao, Longbing [1 ,2 ]
Wang, Chang-Dong [3 ,4 ]
Affiliations
[1] Macquarie Univ, Data Sci Lab, Sch Comp, Sydney, NSW 2109, Australia
[2] Macquarie Univ, Data Sci Lab, DataX Res Ctr, Sydney, NSW 2109, Australia
[3] Sun Yat Sen Univ, Comp Sci & Engn, Guangdong Prov Key Lab Computat Sci, Minist Educ, Guangzhou 510275, Peoples R China
[4] Sun Yat Sen Univ, Comp Sci & Engn, Key Lab Machine Intelligence & Adv Comp, Minist Educ, Guangzhou 510275, Peoples R China
Funding
Australian Research Council;
Keywords
Training; Noise measurement; Task analysis; Complexity theory; Training data; Neural networks; Metalearning; Complementary label; generalization; gray learning (GL); non-independent and identically distributed (Non-IID) data; out-of-distribution data;
DOI
10.1109/TNNLS.2023.3330475
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The integrity of training data, even when annotated by experts, is far from guaranteed, especially for non-independent and identically distributed (non-IID) datasets comprising both in- and out-of-distribution samples. In an ideal scenario, the majority of samples would be in-distribution, while samples that deviate semantically would be identified as out-of-distribution and excluded during the annotation process. However, experts may erroneously classify these out-of-distribution samples as in-distribution, assigning them labels that are inherently unreliable. This mixture of unreliable labels and varied data types makes the task of learning robust neural networks notably challenging. We observe that both in- and out-of-distribution samples can almost invariably be ruled out from belonging to certain classes, aside from those corresponding to unreliable ground-truth labels. This opens the possibility of utilizing reliable complementary labels that indicate the classes to which a sample does not belong. Guided by this insight, we introduce a novel approach, termed gray learning (GL), which leverages both ground-truth and complementary labels. Crucially, GL adaptively adjusts the loss weights for these two label types based on prediction confidence levels. By grounding our approach in statistical learning theory, we derive bounds for the generalization error, demonstrating that GL achieves tight constraints even in non-IID settings. Extensive experimental evaluations reveal that our method significantly outperforms alternative approaches grounded in robust statistics.
Pages: 1396 - 1409
Number of pages: 14
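
The abstract describes gray learning (GL) as combining a loss on the possibly unreliable ground-truth label with a loss on reliable complementary labels (classes the sample is known not to belong to), with the two loss weights adapted by prediction confidence. The snippet below is a minimal, hypothetical PyTorch sketch of that idea only; the confidence-based weighting, the particular complementary-label loss, and the name gray_learning_loss are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch of a gray-learning-style loss (illustrative, not the paper's algorithm).
# Idea: mix ordinary cross-entropy on the annotated (possibly unreliable) label with a
# complementary-label term that only penalizes probability assigned to excluded classes,
# weighting the two terms by the model's confidence in the annotated label.
import torch
import torch.nn.functional as F


def gray_learning_loss(logits, labels, num_classes):
    """logits: (batch, num_classes); labels: (batch,) possibly unreliable ground truth."""
    probs = F.softmax(logits, dim=1)                          # predicted class probabilities
    conf = probs.gather(1, labels.unsqueeze(1)).squeeze(1)    # confidence in the given label

    # Standard cross-entropy: trusts the annotated label.
    ce_loss = F.cross_entropy(logits, labels, reduction="none")

    # Complementary-label term: penalize probability mass on every class except the
    # annotated one, using -log(1 - p_k) averaged over the complementary classes.
    comp_mask = F.one_hot(labels, num_classes).bool()
    comp_probs = probs.masked_fill(comp_mask, 0.0)
    comp_loss = -torch.log(1.0 - comp_probs.clamp(max=1 - 1e-6)).sum(dim=1) / (num_classes - 1)

    # Confidence-adaptive weighting (assumption): rely on the ground-truth label when the
    # model is confident in it, otherwise lean on the more reliable complementary signal.
    w = conf.detach()
    return (w * ce_loss + (1.0 - w) * comp_loss).mean()


if __name__ == "__main__":
    # Tiny usage demo with random inputs standing in for model outputs.
    logits = torch.randn(8, 10, requires_grad=True)
    labels = torch.randint(0, 10, (8,))
    loss = gray_learning_loss(logits, labels, num_classes=10)
    loss.backward()
    print(loss.item())
```

In this sketch, a confident prediction on the annotated label shifts weight toward ordinary cross-entropy, while low confidence shifts it toward the complementary term, which only pushes probability away from classes the sample is known not to belong to.
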
Related Papers
50 records in total
  • [1] Federated Learning With Taskonomy for Non-IID Data
    Jamali-Rad, Hadi
    Abdizadeh, Mohammad
    Singh, Anuj
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (11) : 8719 - 8730
  • [2] Federated Learning With Non-IID Data: A Survey
    Lu, Zili
    Pan, Heng
    Dai, Yueyue
    Si, Xueming
    Zhang, Yan
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (11) : 19188 - 19209
  • [3] A Study of Enhancing Federated Learning on Non-IID Data with Server Learning
    Mai, V. S.
    La, R. J.
    Zhang, T.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024, 5 (11) : 1 - 15
  • [4] Federated Analytics Informed Distributed Industrial IoT Learning With Non-IID Data
    Wang, Zibo
    Zhu, Yifei
    Wang, Dan
    Han, Zhu
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2023, 10 (05) : 2924 - 2939
  • [5] Federated Learning With Non-IID Data in Wireless Networks
    Zhao, Zhongyuan
    Feng, Chenyuan
    Hong, Wei
    Jiang, Jiamo
    Jia, Chao
    Quek, Tony Q. S.
    Peng, Mugen
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2022, 21 (03) : 1927 - 1942
  • [6] Overcoming Noisy Labels and Non-IID Data in Edge Federated Learning
    Xu, Yang
    Liao, Yunming
    Wang, Lun
    Xu, Hongli
    Jiang, Zhida
    Zhang, Wuyang
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (12) : 11406 - 11421
  • [7] Ensemble Federated Learning With Non-IID Data in Wireless Networks
    Zhao, Zhongyuan
    Wang, Jingyi
    Hong, Wei
    Quek, Tony Q. S.
    Ding, Zhiguo
    Peng, Mugen
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (04) : 3557 - 3571
  • [8] FedPD: A Federated Learning Framework With Adaptivity to Non-IID Data
    Zhang, Xinwei
    Hong, Mingyi
    Dhople, Sairaj
    Yin, Wotao
    Liu, Yang
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 (69) : 6055 - 6070
  • [9] FedKT: Federated learning with knowledge transfer for non-IID data
    Mao, Wenjie
    Yu, Bin
    Zhang, Chen
    Qin, A. K.
    Xie, Yu
    PATTERN RECOGNITION, 2025, 159
  • [10] Feature Matching Data Synthesis for Non-IID Federated Learning
    Li, Zijian
    Sun, Yuchang
    Shao, Jiawei
    Mao, Yuyi
    Wang, Jessie Hui
    Zhang, Jun
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (10) : 9352 - 9367