Supporting Deep Neural Network Safety Analysis and Retraining Through Heatmap-Based Unsupervised Learning

Cited by: 24
Authors
Fahmy, Hazem [1]
Pastore, Fabrizio [1]
Bagherzadeh, Mojtaba [2]
Briand, Lionel [1, 2]
Affiliations
[1] Univ Luxembourg, SnT Ctr Secur Reliabil & Trust, L-1855 Luxembourg, Luxembourg
[2] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
Funding
European Research Council; Natural Sciences and Engineering Research Council of Canada
Keywords
Heating systems; Neurons; Safety; Training; Root cause analysis; Labeling; Debugging; Deep neural network (DNN) debugging; DNN explanation; DNN functional safety analysis; heatmaps; ALGORITHMS;
DOI
10.1109/TR.2021.3074750
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks (DNNs) are increasingly important in safety-critical systems, for example in the perception layer, where they analyze images. Unfortunately, methods to ensure the functional safety of DNN-based components are lacking. We observe three major challenges with existing practices regarding DNNs in safety-critical systems: 1) scenarios that are under-represented in the test set may lead to serious safety violation risks yet remain unnoticed; 2) characterizing such high-risk scenarios is critical for safety analysis; 3) retraining DNNs to address these risks is poorly supported when the causes of violations are difficult to determine. To address these problems in the context of DNNs analyzing images, we propose heatmap-based unsupervised debugging of DNNs (HUDD), an approach that automatically supports the identification of root causes for DNN errors. HUDD identifies root causes by applying a clustering algorithm to heatmaps capturing the relevance of every DNN neuron to the DNN outcome. HUDD also retrains DNNs with images that are automatically selected based on their relatedness to the identified image clusters. We evaluated HUDD with DNNs from the automotive domain. HUDD identified all the distinct root causes of DNN errors, thus supporting safety analysis. Our retraining approach was also shown to be more effective at improving DNN accuracy than existing approaches.
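The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or distance HUDD uses, so the average-linkage agglomerative clustering and Euclidean distance below are assumptions chosen for illustration, and the function names (`cluster_heatmaps`, `average_linkage`) and the toy heatmap values are hypothetical. Each heatmap is represented as a flat list of per-neuron relevance scores for one error-inducing image.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two flattened heatmaps (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, heatmaps):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(heatmaps[i], heatmaps[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_heatmaps(heatmaps, n_clusters):
    """Greedy agglomerative clustering: start with one cluster per heatmap
    and repeatedly merge the two closest clusters until n_clusters remain.
    Returns a list of clusters, each a sorted list of heatmap indices."""
    clusters = [[i] for i in range(len(heatmaps))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], heatmaps)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical relevance scores for five error-inducing images.
# Images 0, 1, 4 put most relevance on the first two neurons;
# images 2, 3 put it on the last two.
toy_heatmaps = [
    [0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.8, 0.9],
    [0.9, 0.7, 0.1, 0.1],
]

groups = cluster_heatmaps(toy_heatmaps, n_clusters=2)
print(groups)  # [[0, 1, 4], [2, 3]]
```

In this toy setting each resulting group would be inspected as a candidate root cause, since its members share a similar relevance pattern. The number of clusters is fixed by hand here; how it is chosen in practice is not stated in the abstract.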
Pages: 1641-1657
Number of pages: 17