Mechanisms of human dynamic object recognition revealed by sequential deep neural networks

Cited by: 5
Authors
Sorensen, Lynn K. A. [1, 2]
Bohte, Sander [3, 4, 5]
de Jong, Dorina [6, 7]
Slagter, Heleen [8, 9]
Scholte, H. Steven [1, 2]
Affiliations
[1] Univ Amsterdam, Dept Psychol, Amsterdam, Netherlands
[2] Univ Amsterdam, Amsterdam Brain & Cognit ABC, Amsterdam, Netherlands
[3] Ctr Wiskunde & Informat, Machine Learning Grp, Amsterdam, Netherlands
[4] Univ Amsterdam, Swammerdam Inst Life Sci SILS, Amsterdam, Netherlands
[5] Univ Groningen, Bernoulli Inst, Groningen, Netherlands
[6] Ctr Translat Neurophysiol Speech & Commun CTNSC, Ist Italiano Tecnol, Ferrara, Italy
[7] Univ Ferrara, Dipartimento Sci Biomed & Chirurg Specialist, Ferrara, Italy
[8] Vrije Univ Amsterdam, Dept Expt & Appl Psychol, Amsterdam, Netherlands
[9] Vrije Univ Amsterdam, Inst Brain & Behav Amsterdam, Amsterdam, Netherlands
Funding
Dutch Research Council
Keywords
POWER-LAW ADAPTATION; SENSORY ADAPTATION; CONCEPTUAL MASKING; MEMORY; REPRESENTATIONS; INFORMATION; MODELS; SPEED; RSVP; MS;
DOI
10.1371/journal.pcbi.1011169
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Author summary: Our visual world is both stable and dynamic: even within a single glance, a scene may change dramatically. Brains thus need to balance integration of information over time to create stable percepts with sensitivity to changes in sensory input, e.g., to rapidly recognize new objects. How do brains and, in particular, visual systems achieve this? Here, we addressed this question by having humans and different neural network models perform the same object recognition task in which sequences of images were shown in rapid or slow succession. We observed that models treating images as a continuous sequence by integrating their processing over time reproduced human performance patterns better than models processing every single image at a time. Furthermore, models equipped with sensory adaptation, a form of stimulus habituation, better recognized objects in faster sequences and more efficiently captured human behaviour. These findings show that lateral recurrence and adaptation jointly enable object recognition across a wide variety of time scales, suggesting a critical role for these mechanisms in dynamic vision.

Humans can quickly recognize objects in a dynamically changing world. This ability is showcased by the fact that observers succeed at recognizing objects in rapidly changing image sequences, at up to 13 ms/image. To date, the mechanisms that govern dynamic object recognition remain poorly understood. Here, we developed deep learning models for dynamic recognition and compared different computational mechanisms, contrasting feedforward and recurrent, single-image and sequential processing as well as different forms of adaptation. We found that only models that integrate images sequentially via lateral recurrence mirrored human performance (N = 36) and were predictive of trial-by-trial responses across image durations (13-80 ms/image). Importantly, models with sequential lateral-recurrent integration also captured how human performance changes as a function of image presentation durations, with models processing images for a few time steps capturing human object recognition at shorter presentation durations and models processing images for more time steps capturing human object recognition at longer presentation durations. Furthermore, augmenting such a recurrent model with adaptation markedly improved dynamic recognition performance and accelerated its representational dynamics, thereby predicting human trial-by-trial responses using fewer processing resources. Together, these findings provide new insights into the mechanisms rendering object recognition so fast and effective in a dynamic visual world.
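As a rough illustration of the two mechanisms the abstract names, the toy sketch below runs a single layer over a sequence of input vectors, carrying its state from one frame to the next (sequential processing) and optionally subtracting a slowly updated trace of its own activity (adaptation). It is not the authors' model: the function name `run_sequence`, the diagonal placeholder weights, the subtractive adaptation rule, and all parameter values are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def run_sequence(frames, steps_per_frame, lateral=None,
                 adapt_gain=0.0, adapt_decay=0.9):
    """Toy recurrent layer processing a frame sequence (hypothetical sketch).

    frames          : array of shape (n_frames, n_units), one input per image
    steps_per_frame : number of time steps each image stays on
    lateral         : (n_units, n_units) within-layer weights (assumed values)
    adapt_gain      : strength of subtractive adaptation; 0 disables it
    adapt_decay     : how slowly the adaptation trace forgets past activity
    """
    frames = np.asarray(frames, dtype=float)
    n_units = frames.shape[1]
    if lateral is None:
        # Placeholder weights: each unit only feeds back onto itself.
        lateral = 0.5 * np.eye(n_units)

    h = np.zeros(n_units)  # activity, NOT reset between frames
    a = np.zeros(n_units)  # adaptation trace of recent activity
    trace = []
    for x in frames:
        for _ in range(steps_per_frame):
            # Input plus within-layer recurrence, minus adaptation, then ReLU.
            h = np.maximum(x + lateral @ h - adapt_gain * a, 0.0)
            # Exponential moving average of the unit's own activity.
            a = adapt_decay * a + (1.0 - adapt_decay) * h
            trace.append(h.copy())
    return np.array(trace)

# One constant image held for 50 time steps, with and without adaptation.
frames = np.ones((1, 4))
plain = run_sequence(frames, steps_per_frame=50)
adapted = run_sequence(frames, steps_per_frame=50, adapt_gain=0.5)
```

In this sketch, a shorter presentation duration corresponds to a smaller `steps_per_frame`, and the adapted layer settles at a lower response to a sustained input than the unadapted one, which leaves more of its range available for whatever frame comes next.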
Pages: 30