Performance analysis of out-of-distribution detection on trained neural networks

Cited by: 12
Authors
Henriksson, Jens [1 ,2 ]
Berger, Christian [2 ,3 ]
Borg, Markus [4 ,5 ]
Tornberg, Lars [6 ]
Sathyamoorthy, Sankar Raman [7 ]
Englund, Cristofer [4 ,5 ]
Affiliations
[1] Semcon Sweden AB, Gothenburg, Sweden
[2] Chalmers Univ Technol, Gothenburg, Sweden
[3] Univ Gothenburg, Gothenburg, Sweden
[4] RISE Res Inst Sweden AB, Lund, Sweden
[5] RISE Res Inst Sweden AB, Gothenburg, Sweden
[6] Volvo Cars, Machine Learning & AI Ctr Excellence, Gothenburg, Sweden
[7] QRTech AB, Gothenburg, Sweden
Keywords
Deep neural networks; Robustness; Out-of-distribution; Automotive perception; Safety-critical systems;
DOI
10.1016/j.infsof.2020.106409
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Context: Deep Neural Networks (DNNs) have shown great promise in various domains, for example in supporting pattern recognition in medical imagery. However, DNNs need to be tested for robustness before being deployed in safety-critical applications. One common challenge occurs when the model is exposed to data samples outside of the training data domain, which can yield outputs with high confidence despite the model having no prior knowledge of the given input.
Objective: The aim of this paper is to investigate how the performance of detecting out-of-distribution (OOD) samples changes for outlier detection methods (referred to as supervisors) as DNNs become better trained on the training samples.
Method: Supervisors are components that aim to detect out-of-distribution samples for a DNN. The experimental setup in this work compares the performance of supervisors using metrics and datasets that reflect the most common setups in related work. Four different DNNs with three different supervisors are compared at different stages of training, to detect at which point during training the performance of the supervisors begins to deteriorate.
Results: The outlier detection performance of the supervisors increased as the accuracy of the underlying DNN improved. However, all supervisors showed a large variation in performance, even for variations of network parameters that only marginally changed the model accuracy. The results show that understanding the relationship between training results and supervisor performance is crucial for improving a model's robustness.
Conclusion: Analyzing DNNs for robustness is a challenging task. The results show that variations in model parameters that have only a small effect on model predictions can have a large impact on out-of-distribution detection performance. This kind of behavior needs to be addressed when DNNs are part of a safety-critical application, and the necessary safety argumentation for such systems needs to be structured accordingly.
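As a concrete illustration of the supervisor concept described in the abstract, the sketch below flags an input as out-of-distribution when the DNN's maximum softmax confidence falls below a threshold. This is a minimal, hypothetical example rather than the authors' implementation; the 0.5 threshold, the predict-logits interface, and the 4-class logits are illustrative assumptions.

# Minimal sketch (not the paper's code) of a "supervisor" in the paper's sense:
# a component that flags out-of-distribution (OOD) inputs for a trained DNN,
# here using the baseline maximum-softmax-probability score.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def supervisor_flags_ood(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return True for samples whose maximum softmax confidence is below the threshold."""
    confidence = softmax(logits).max(axis=-1)
    return confidence < threshold

# Usage: logits for a batch of 3 inputs from a hypothetical 4-class DNN.
logits = np.array([[9.0, 0.1, 0.2, 0.1],   # confident -> kept as in-distribution
                   [1.1, 1.0, 0.9, 1.0],   # near-uniform -> flagged as OOD
                   [0.2, 6.5, 0.3, 0.1]])  # confident -> kept as in-distribution
print(supervisor_flags_ood(logits))  # [False  True False]

In the paper's experimental setup, such a supervisor would be evaluated repeatedly against snapshots of the DNN taken at different stages of training, to see how its detection performance changes as model accuracy improves.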
Pages: 12