Models of vision need some action

Cited by: 1
Authors
Rothkopf, Constantin [1, 2, 3, 4, 5]
Bremmer, Frank [3, 4, 5, 6]
Fiehler, Katja [3, 4, 5, 7]
Dobs, Katharina [3, 4, 5, 7]
Triesch, Jochen [2, 3, 4, 5]
Affiliations
[1] Tech Univ Darmstadt, Ctr Cognit Sci, Darmstadt, Germany
[2] Goethe Univ Frankfurt, Frankfurt Inst Adv Studies, Frankfurt, Germany
[3] Univ Marburg, Ctr Mind Brain & Behav, Giessen, Germany
[4] Justus Liebig Univ Giessen, Giessen, Germany
[5] HMWK Clusterproject Adapt Mind, Hessen, Germany
[6] Univ Marburg, Appl Phys & Neurophys, Marburg, Germany
[7] Justus Liebig Univ Giessen, Expt Psychol, Giessen, Germany
Keywords
Brain-Score; computational neuroscience; deep neural networks; human vision; object recognition
DOI
10.1017/S0140525X23001577
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modeling approaches that focus on psychological data.
Pages: 77