Rethinking Out-of-Distribution Detection From a Human-Centric Perspective

Cited by: 3
Authors
Zhu, Yao [1 ]
Chen, Yuefeng [2 ]
Li, Xiaodan [2 ]
Zhang, Rong [2 ]
Xue, Hui [2 ]
Tian, Xiang [1 ,4 ]
Jiang, Rongxin [1 ,4 ]
Zheng, Bolun [3 ,4 ]
Chen, Yaowu [1 ,5 ]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Zhejiang, Peoples R China
[2] Alibaba Grp, Secur Dept, Hangzhou 310023, Zhejiang, Peoples R China
[3] Hangzhou Dianzi Univ, Hangzhou 310018, Zhejiang, Peoples R China
[4] Zhejiang Prov Key Lab Network Multimedia Technol, Hangzhou 310027, Zhejiang, Peoples R China
[5] Zhejiang Univ, Minist Educ China, Embedded Syst Engn Res Ctr, Hangzhou 310027, Zhejiang, Peoples R China
Keywords
Out-of-Distribution detection; AI reliability; Image classification;
DOI
10.1007/s11263-024-02099-3
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Out-of-Distribution (OOD) detection has received broad attention over the years; it aims to ensure the reliability and safety of deep neural networks (DNNs) in real-world scenarios by rejecting incorrect predictions. However, we notice a discrepancy between the conventional evaluation and the essential purpose of OOD detection. On the one hand, the conventional evaluation exclusively considers risks caused by label-space distribution shifts while ignoring the risks from input-space distribution shifts. On the other hand, the conventional evaluation rewards detection methods for not rejecting misclassified images in the validation dataset, even though misclassified images also cause risks and should be rejected. We appeal for rethinking OOD detection from a human-centric perspective: a proper detection method should reject cases in which the deep model's prediction mismatches human expectations and accept cases in which it meets them. We propose a human-centric evaluation and conduct extensive experiments on 45 classifiers and 8 test datasets. We find that the simple baseline OOD detection method can achieve performance comparable to, and even better than, recently proposed methods, suggesting that progress in OOD detection over the past years may have been overestimated. Additionally, our experiments demonstrate that model selection is non-trivial for OOD detection and should be considered an integral part of a proposed method, which contradicts the claim in existing works that proposed methods are universal across different models.
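The human-centric criterion described in the abstract can be sketched in code. The following is a minimal illustration, not the paper's actual protocol: it assumes the simple maximum-softmax-probability (MSP) baseline as the detection score, and counts a decision as "good" exactly when acceptance matches human expectation (accepted and correct, or rejected and incorrect). The function names and the threshold are hypothetical choices for this sketch.

```python
import numpy as np

def msp_score(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability: a simple baseline confidence score."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

def human_centric_eval(logits: np.ndarray, labels: np.ndarray,
                       threshold: float) -> float:
    """Fraction of 'good' decisions under a human-centric view:
    accept a prediction iff its confidence exceeds the threshold,
    and a decision is good when (accepted and correct) or
    (rejected and incorrect)."""
    preds = logits.argmax(axis=1)
    accepted = msp_score(logits) >= threshold
    correct = preds == labels
    good = (accepted & correct) | (~accepted & ~correct)
    return float(good.mean())

# Toy example: three predictions, one confidently wrong.
logits = np.array([[5.0, 0.0],   # confident and correct  -> good
                   [0.1, 0.0],   # unsure and wrong, rejected -> good
                   [4.0, 0.0]])  # confident but wrong, accepted -> bad
labels = np.array([0, 1, 1])
print(human_centric_eval(logits, labels, threshold=0.9))
```

Note that this metric, unlike the conventional one, penalizes accepting a misclassified in-distribution image just as it penalizes accepting an OOD image.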
Pages: 4633-4650
Number of pages: 18
References
55 in total
[1]   ATOM: Robustifying Out-of-Distribution Detection Using Outlier Mining [J].
Chen, Jiefeng ;
Li, Yixuan ;
Wu, Xi ;
Liang, Yingyu ;
Jha, Somesh .
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III, 2021, 12977 :430-445
[2]   Describing Textures in the Wild [J].
Cimpoi, Mircea ;
Maji, Subhransu ;
Kokkinos, Iasonas ;
Mohamed, Sammy ;
Vedaldi, Andrea .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :3606-3613
[3]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[4]  
Dosovitskiy A., 2021, 9TH INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS (ICLR)
[5]  
Galesso Silvio, 2022, arXiv preprint
[6]  
Geirhos Robert, 2019, INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS (ICLR)
[7]  
Guo CA, 2017, PROCEEDINGS OF MACHINE LEARNING RESEARCH, V70
[8]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[9]   Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem [J].
Hein, Matthias ;
Andriushchenko, Maksym ;
Bitterwolf, Julian .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :41-50
[10]  
Hendrycks D., 2020, INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS (ICLR)