Weighting non-IID batches for out-of-distribution detection

Cited by: 0
Authors
Zhao, Zhilin [1 ,2 ]
Cao, Longbing [1 ,2 ]
Affiliations
[1] Macquarie Univ, Sch Comp, Data Sci Lab, Sydney, NSW 2109, Australia
[2] Macquarie Univ, DataX Res Ctr, Sydney, NSW 2109, Australia
Funding
Australian Research Council
Keywords
Non-IID; Out-of-distribution detection; Dataset discrepancy;
DOI
10.1007/s10994-024-06605-z
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
A standard network pretrained on in-distribution (ID) samples can make high-confidence predictions on out-of-distribution (OOD) samples, and may therefore fail to distinguish ID from OOD samples in the test phase. To address this over-confidence issue, existing methods improve OOD sensitivity from the modeling perspective, i.e., they retrain the network with modified training processes or objective functions. In contrast, this paper proposes a simple but effective method, Weighted Non-IID Batching (WNB), which only adjusts batch weights. WNB builds on a key observation: increasing the batch size can improve OOD detection performance. This is because, with a smaller batch size, the samples in a batch are more likely to be non-IID with respect to the assumed ID distribution, i.e., to behave as if they were drawn from an OOD, which causes the network to make high-confidence predictions for all samples from that OOD. Accordingly, WNB applies a weight function that weights each batch according to the discrepancy between the batch samples and the entire ID training set. The weight function is derived by minimizing a generalization error bound; it assigns larger weights to batches with smaller discrepancies and makes a trade-off between ID classification and OOD detection performance. Experimental results show that incorporating WNB into state-of-the-art OOD detection methods further improves their performance.
Pages: 7371-7391
Number of pages: 21
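
For illustration only, the sketch below shows how batch-level loss weighting in the spirit of the abstract could look in PyTorch. The discrepancy proxy (distance between the batch feature mean and the training-set feature mean), the exponential weighting with temperature `tau`, and the `model.features` / `model.classifier` split are assumptions made for this sketch; the paper derives its actual weight function by minimizing a generalization error bound, which is not reproduced here.

```python
import torch
import torch.nn.functional as F

def batch_weight(batch_feats, train_feat_mean, tau=1.0):
    # Smaller discrepancy between the batch and the full ID training set
    # yields a larger weight. The exponential form is a placeholder, not
    # the bound-derived weight function from the paper.
    discrepancy = torch.norm(batch_feats.mean(dim=0) - train_feat_mean)
    return torch.exp(-discrepancy / tau)

def weighted_train_step(model, batch_x, batch_y, train_feat_mean, optimizer):
    # Assumes `model.features` and `model.classifier` split the network
    # into a feature extractor and a classification head.
    feats = model.features(batch_x)
    logits = model.classifier(feats)
    loss = F.cross_entropy(logits, batch_y)
    w = batch_weight(feats.detach(), train_feat_mean)  # no gradient through the weight
    optimizer.zero_grad()
    (w * loss).backward()  # down-weight batches that look non-IID w.r.t. the ID data
    optimizer.step()
    return loss.item(), w.item()
```

In such a setup, `train_feat_mean` would be computed once over the full ID training set (or maintained as a running estimate) before the weighted training steps are applied.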