HOW CONVOLUTIONAL NEURAL NETWORKS SEE THE WORLD - A SURVEY OF CONVOLUTIONAL NEURAL NETWORK VISUALIZATION METHODS

Cited by: 100
Authors
Qin, Zhuwei [1]
Yu, Fuxun [1]
Liu, Chenchen [2]
Chen, Xiang [1]
Affiliations
[1] George Mason Univ, 4400 Univ Dr, Fairfax, VA 22030 USA
[2] Clarkson Univ, 8 Clarkson Ave, Potsdam, NY 13699 USA
Source
MATHEMATICAL FOUNDATIONS OF COMPUTING | 2018, Vol. 1, No. 2
Keywords
Deep learning; convolutional neural network; CNN feature; CNN visualization; network interpretability
DOI
10.3934/mfc.2018008
Chinese Library Classification
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision tasks, such as object detection, image recognition, and image retrieval. These achievements stem from the CNNs' outstanding capability to learn input features through deep layers of neurons and an iterative training process. However, the learned features are hard to identify and interpret from a human vision perspective, leaving the CNNs' internal working mechanism poorly understood. To improve CNN interpretability, CNN visualization is widely used as a qualitative analysis method that translates internal features into visually perceptible patterns, and many visualization works have been proposed in the literature to interpret CNNs from the perspectives of network structure, operation, and semantic concept. In this paper, we provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experimental results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas such as network design, optimization, and security enhancement.
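Of the methods named in the abstract, Activation Maximization is the simplest to illustrate: synthesize an input by gradient ascent so that a chosen unit's activation is maximized. The following is a minimal toy sketch, assuming a single fixed linear "neuron" stands in for a CNN unit (in practice the gradient is backpropagated through the whole trained network, usually with regularizers on the image):

```python
import numpy as np

# Toy Activation Maximization: find the input pattern that a fixed
# linear "neuron" (weights w) responds to most strongly.
rng = np.random.default_rng(0)
w = rng.normal(size=64)        # stand-in for a trained CNN unit's weights
x = np.zeros(64)               # start from a blank "image"

def activation(x):
    return w @ x               # the unit's response to input x

lr, l2 = 0.1, 0.01             # step size and L2 penalty (keeps x bounded)
for _ in range(100):
    # gradient of the objective  w.x - l2 * ||x||^2  with respect to x
    grad = w - 2 * l2 * x
    x += lr * grad             # gradient ascent step

# x converges toward a scaled copy of w: the "preferred stimulus" of the unit.
```

With a real CNN the same loop runs over an image tensor, and visualizing the resulting `x` reveals the pattern each filter detects.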
Pages: 149-180 (32 pages)