Deep learning models fail to capture the configural nature of human shape perception

Cited by: 28

Authors:

Baker, Nicholas [1]

Elder, James H. [2]

Affiliations:

[1] Loyola Univ, Dept Psychol, Chicago, IL 60660 USA

[2] York Univ, Ctr Vis Res, Toronto, ON M3J 1P3, Canada

Source:

iScience | 2022 / Volume 25 / Issue 9

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

GESTALT PSYCHOLOGY; LOCAL FEATURES; FACE; INFORMATION; WHOLES; PARTS; CLASSIFICATION; RECOGNITION; INVERSION;

DOI:

10.1016/j.isci.2022.104913

Chinese Library Classification (CLC) codes:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

A hallmark of human object perception is sensitivity to the holistic configuration of the local shape features of an object. Deep convolutional neural networks (DCNNs) are currently the dominant models for object recognition processing in the visual cortex, but do they capture this configural sensitivity? To answer this question, we employed a dataset of animal silhouettes and created a variant of this dataset that disrupts the configuration of each object while preserving local features. While human performance was impacted by this manipulation, DCNN performance was not, indicating insensitivity to object configuration. Modifications to training and architecture to make networks more brain-like did not lead to configural processing, and none of the networks were able to accurately predict trial-by-trial human object judgements. We speculate that to match human configural sensitivity, networks must be trained to solve a broader range of object tasks beyond category recognition.

References

Pages: 16

70 entries in total

[1] Local features and global shape information in object classification by deep convolutional neural networks [J].