Where Can We Help? A Visual Analytics Approach to Diagnosing and Improving Semantic Segmentation of Movable Objects

被引:26
作者
He, Wenbin [1 ]
Zou, Lincan [1 ]
Shekar, Arvind Kumar [2 ]
Gou, Liang [1 ]
Ren, Liu [1 ]
机构
[1] Robert Bosch Res & Technol Ctr, Cambridge, MA 02139 USA
[2] Robert Bosch GmbH, Gerlingen, Germany
关键词
Semantics; Image segmentation; Autonomous vehicles; Analytical models; Robustness; Visual analytics; Data models; Model diagnosis; semantic segmentation; spatial representation learning; adversarial learning; autonomous driving;
D O I
10.1109/TVCG.2021.3114855
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Semantic segmentation is a critical component in autonomous driving and has to be thoroughly evaluated due to safety concerns. Deep neural network (DNN) based semantic segmentation models are widely used in autonomous driving. However, it is challenging to evaluate DNN-based models due to their black-box-like nature, and it is even more difficult to assess model performance for crucial objects, such as lost cargos and pedestrians, in autonomous driving applications. In this work, we propose VASS, a Visual Analytics approach to diagnosing and improving the accuracy and robustness of Semantic Segmentation models, especially for critical objects moving in various driving scenes. The key component of our approach is a context-aware spatial representation learning that extracts important spatial information of objects, such as position, size, and aspect ratio, with respect to given scene contexts. Based on this spatial representation, we first use it to create visual summarization to analyze models' performance. We then use it to guide the generation of adversarial examples to evaluate models' spatial robustness and obtain actionable insights. We demonstrate the effectiveness of VASS via two case studies of lost cargo detection and pedestrian detection in autonomous driving. For both cases, we show quantitative evaluation on the improvement of models' performance with actionable insights obtained from VASS.
引用
收藏
页码:1040 / 1050
页数:11
相关论文
共 52 条
[1]   Visual Methods for Analyzing Probabilistic Classification Data [J].
Alsallakh, Bilal ;
Hanbury, Allan ;
Hauser, Helwig ;
Miksch, Silvia ;
Rauber, Andreas .
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2014, 20 (12) :1703-1712
[2]  
[Anonymous], 2016, P INT C LEARN REPR W
[3]  
[Anonymous], 2016, P EG VIS COMP BIOL M
[4]  
[Anonymous], 2014, Intriguing properties of neural networks
[5]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[6]  
Burgess Christopher P, 2018, Understanding disentangling in -VAE
[7]   Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [J].
Chen, Liang-Chieh ;
Zhu, Yukun ;
Papandreou, George ;
Schroff, Florian ;
Adam, Hartwig .
COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 :833-851
[8]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[9]   Xception: Deep Learning with Depthwise Separable Convolutions [J].
Chollet, Francois .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1800-1807
[10]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223