Using Psychophysical Methods to Understand Mechanisms of Face Identification in a Deep Neural Network

Cited by: 3
Authors
Xu, Tian [1 ]
Garrod, Oliver [1 ]
Scholte, Steven H. [2 ]
Ince, Robin [1 ]
Schyns, Philippe G. [1 ]
Affiliations
[1] Univ Glasgow, Glasgow, Lanark, Scotland
[2] Univ Amsterdam, Amsterdam, Netherlands
Source
PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) | 2018
Funding
Wellcome Trust (UK); Engineering and Physical Sciences Research Council (UK);
Keywords
INFORMATION; RECOGNITION; FEATURES;
DOI
10.1109/CVPRW.2018.00266
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep Convolutional Neural Networks (CNNs) have been one of the most influential recent developments in computer vision, particularly for categorization [20]. The promise of CNNs is at least two-fold. First, they represent the best engineering solution to successfully tackle the foundational task of visual categorization with a performance level that even exceeds that of humans [19, 27]. Second, for computational neuroscience, CNNs provide a testable modelling platform for visual categorizations inspired by the multilayered organization of visual cortex [7]. Here, we used a 3D generative model to control the variance of information learned to identify 2,000 face identities in one CNN architecture (10-layer ResNet [9]). We generated 25M face images to train the network by randomly sampling intrinsic (i.e. face morphology, gender, age, expression and ethnicity) and extrinsic factors of face variance (i.e. 3D pose, illumination, scale and 2D translation). At testing, the network performed with 99% generalization accuracy for face identity across variations of intrinsic and extrinsic factors. State-of-the-art information mapping techniques from psychophysics (i.e. Representational Similarity Analysis [18] and Bubbles [8]) revealed respectively the network layer at which factors of variance are resolved and the face features that are used for identity. By explicitly controlling the generative factors of face information, we provide an alternative framework based on human psychophysics to understand information processing in CNNs.
Pages: 2057-2065
Number of pages: 9
Related papers
43 entries in total
[1]  
[Anonymous], 2016, PROC CVPR IEEE, DOI 10.1109/CVPR.2016.319
[2]  
[Anonymous], 2007, Tech. rep
[3]   PSYCHOPHYSICAL SUPPORT FOR A 2-DIMENSIONAL VIEW INTERPOLATION THEORY OF OBJECT RECOGNITION [J].
BULTHOFF, HH ;
EDELMAN, S .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (01) :60-64
[4]   THE LAPLACIAN PYRAMID AS A COMPACT IMAGE CODE [J].
BURT, PJ ;
ADELSON, EH .
IEEE TRANSACTIONS ON COMMUNICATIONS, 1983, 31 (04) :532-540
[5]   Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition [J].
Cadieu, Charles F. ;
Hong, Ha ;
Yamins, Daniel L. K. ;
Pinto, Nicolas ;
Ardila, Diego ;
Solomon, Ethan A. ;
Majaj, Najib J. ;
DiCarlo, James J. .
PLOS COMPUTATIONAL BIOLOGY, 2014, 10 (12)
[6]   Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence [J].
Cichy, Radoslaw Martin ;
Khosla, Aditya ;
Pantazis, Dimitrios ;
Torralba, Antonio ;
Oliva, Aude .
SCIENTIFIC REPORTS, 2016, 6
[7]  
Cybenko G., 1989, Mathematics of Control, Signals, and Systems, V2, P303, DOI 10.1007/BF02551274
[8]  
Duda R.O., 1995, Pattern Classification and Scene Analysis, Vsecond
[9]   NEOCOGNITRON - A HIERARCHICAL NEURAL NETWORK CAPABLE OF VISUAL-PATTERN RECOGNITION [J].
FUKUSHIMA, K .
NEURAL NETWORKS, 1988, 1 (02) :119-130
[10]   Bubbles: a technique to reveal the use of information in recognition tasks [J].
Gosselin, F ;
Schyns, PG .
VISION RESEARCH, 2001, 41 (17) :2261-2271