Gray Learning From Non-IID Data With Out-of-Distribution Samples

Cited by: 0
Authors
Zhao, Zhilin [1 ,2 ]
Cao, Longbing [1 ,2 ]
Wang, Chang-Dong [3 ,4 ]
Affiliations
[1] Macquarie Univ, Data Sci Lab, Sch Comp, Sydney, NSW 2109, Australia
[2] Macquarie Univ, Data Sci Lab, DataX Res Ctr, Sydney, NSW 2109, Australia
[3] Sun Yat Sen Univ, Comp Sci & Engn, Guangdong Prov Key Lab Computat Sci, Minist Educ, Guangzhou 510275, Peoples R China
[4] Sun Yat Sen Univ, Comp Sci & Engn, Key Lab Machine Intelligence & Adv Comp, Minist Educ, Guangzhou 510275, Peoples R China
Funding
Australian Research Council
Keywords
Training; Noise measurement; Task analysis; Complexity theory; Training data; Neural networks; Metalearning; Complementary label; generalization; gray learning (GL); non-independent and identically distributed (Non-IID) data; out-of-distribution data;
DOI
10.1109/TNNLS.2023.3330475
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The integrity of training data, even when annotated by experts, is far from guaranteed, especially for non-independent and identically distributed (non-IID) datasets comprising both in- and out-of-distribution samples. In an ideal scenario, the majority of samples would be in-distribution, while samples that deviate semantically would be identified as out-of-distribution and excluded during the annotation process. However, experts may erroneously classify these out-of-distribution samples as in-distribution, assigning them labels that are inherently unreliable. This mixture of unreliable labels and varied data types makes the task of learning robust neural networks notably challenging. We observe that both in- and out-of-distribution samples can almost invariably be ruled out from belonging to certain classes, aside from those corresponding to unreliable ground-truth labels. This opens the possibility of utilizing reliable complementary labels that indicate the classes to which a sample does not belong. Guided by this insight, we introduce a novel approach, termed gray learning (GL), which leverages both ground-truth and complementary labels. Crucially, GL adaptively adjusts the loss weights for these two label types based on prediction confidence levels. By grounding our approach in statistical learning theory, we derive bounds for the generalization error, demonstrating that GL achieves tight constraints even in non-IID settings. Extensive experimental evaluations reveal that our method significantly outperforms alternative approaches grounded in robust statistics.
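To make the mechanism described in the abstract concrete, below is a minimal PyTorch-style sketch of a confidence-weighted loss that combines ground-truth and complementary labels. This is an illustrative reading of the abstract, not the authors' released implementation: the name `gray_loss`, the choice of -log(1 - p_c) for the complementary term, and the use of the predicted probability of the annotated label as the adaptive weight are all assumptions.

```python
# Illustrative sketch of the gray-learning (GL) idea from the abstract:
# combine a ground-truth loss with a complementary-label loss, weighting
# the two per sample by prediction confidence. Assumptions, not the
# authors' code: the function name, the -log(1 - p_c) complementary term,
# and confidence = predicted probability of the annotated label.
import torch
import torch.nn.functional as F

def gray_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    probs = F.softmax(logits, dim=1)                                  # (B, C)
    # Confidence the network assigns to the (possibly unreliable) label.
    conf = probs.gather(1, labels.unsqueeze(1)).squeeze(1).detach()   # (B,)

    # Loss on the annotated ground-truth label.
    ce = F.cross_entropy(logits, labels, reduction="none")            # (B,)

    # Complementary-label loss: each class c != y is a class the sample
    # is assumed NOT to belong to, so push p_c toward 0 via -log(1 - p_c).
    num_classes = logits.size(1)
    mask = F.one_hot(labels, num_classes).bool()
    log_not = torch.log1p(-probs.clamp(max=1.0 - 1e-6))               # log(1 - p)
    comp = -log_not.masked_fill(mask, 0.0).sum(dim=1) / (num_classes - 1)

    # Adaptive trade-off: rely on the given label when the model is
    # confident in it; otherwise lean on the (reliable) complementary labels.
    return (conf * ce + (1.0 - conf) * comp).mean()
```

In a standard training loop, `loss = gray_loss(model(x), y)` would replace a plain cross-entropy call; out-of-distribution samples, whose annotated labels tend to receive low confidence, are then trained mainly through their complementary labels.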
Pages: 1396 - 1409
Page count: 14
Related Papers
50 records in total
  • [41] FedSiM: a similarity metric federal learning mechanism based on stimulus response method with Non-IID data
    Wang, Shuangzhong
    Zhang, Ying
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2023, 34 (12)
  • [42] An Efficient Data Augmentation Network for Out-of-Distribution Image Detection
    Lin, Cheng-Hung
    Lin, Cheng-Shian
    Chou, Po-Yung
    Hsu, Chen-Chien
    IEEE ACCESS, 2021, 9 : 35313 - 35323
  • [43] FedSG: A Personalized Subgraph Federated Learning Framework on Multiple Non-IID Graphs
    Wang, Yingcheng
    Guo, Songtao
    Qiao, Dewen
    Liu, Guiyan
    Li, Mingyan
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (05) : 3678 - 3690
  • [44] Rehearsal-Free Continual Learning over Small Non-IID Batches
    Lomonaco, Vincenzo
    Maltoni, Davide
    Pellegrini, Lorenzo
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020 : 989 - 998
  • [45] Out-of-Distribution Detection by Cross-Class Vicinity Distribution of In-Distribution Data
    Zhao, Zhilin
    Cao, Longbing
    Lin, Kun-Yu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 13777 - 13788
  • [46] Gently Sloped and Extended Classification Margin for Overconfidence Relaxation of Out-of-Distribution Samples
    Kim, Taewook
    Lee, Jong-Seok
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024
  • [47] FedCiR: Client-Invariant Representation Learning for Federated Non-IID Features
    Li, Zijian
    Lin, Zehong
    Shao, Jiawei
    Mao, Yuyi
    Zhang, Jun
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (11) : 10509 - 10522
  • [48] Adversarially-Regularized Mixed Effects Deep Learning (ARMED) Models Improve Interpretability, Performance, and Generalization on Clustered (non-iid) Data
    Nguyen, Kevin P.
    Treacher, Alex H.
    Montillo, Albert A.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (07) : 8081 - 8093
  • [49] CausPref: Causal Preference Learning for Out-of-Distribution Recommendation
    He, Yue
    Wang, Zimu
    Cui, Peng
    Zou, Hao
    Zhang, Yafeng
    Cui, Qiang
    Jiang, Yong
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022 : 410 - 421
  • [50] FlGan: GAN-Based Unbiased Federated Learning Under Non-IID Settings
    Ma, Zhuoran
    Liu, Yang
    Miao, Yinbin
    Xu, Guowen
    Liu, Ximeng
    Ma, Jianfeng
    Deng, Robert H.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1566 - 1581