A brain-inspired object-based attention network for multiobject recognition and visual reasoning

Citations: 5
Authors
Adeli, Hossein [1]
Ahn, Seoyoung [1]
Zelinsky, Gregory J. [1, 2]
Affiliations
[1] SUNY Stony Brook, Dept Psychol, Stony Brook, NY USA
[2] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Source
JOURNAL OF VISION | 2023 / Vol. 23 / Issue 05
Keywords
CONVOLUTIONAL NEURAL-NETWORKS; ZOOM LENS; PERCEPTION; MODEL; MECHANISMS; GRADIENT; TASK
DOI
10.1167/jov.23.5.16
Chinese Library Classification (CLC) number
R77 [Ophthalmology]
Discipline classification code
100212
Abstract
The visual system uses sequences of selective glimpses to objects to support goal-directed behavior, but how is this attention control learned? Here we present an encoder-decoder model inspired by the interacting bottom-up and top-down visual pathways making up the recognition-attention system in the brain. At every iteration, a new glimpse is taken from the image and is processed through the "what" encoder, a hierarchy of feedforward, recurrent, and capsule layers, to obtain an object-centric (object-file) representation. This representation feeds to the "where" decoder, where the evolving recurrent representation provides top-down attentional modulation to plan subsequent glimpses and impact routing in the encoder. We demonstrate how the attention mechanism significantly improves the accuracy of classifying highly overlapping digits. In a visual reasoning task requiring comparison of two objects, our model achieves near-perfect accuracy and significantly outperforms larger models in generalizing to unseen stimuli. Our work demonstrates the benefits of object-based attention mechanisms taking sequential glimpses of objects.
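The glimpse loop described in the abstract can be illustrated with a toy sketch. This is not the authors' model: the capsule layers and learned routing are omitted, the weights are random and untrained, and all names (`take_glimpse`, `run_glimpse_loop`, `w_enc`, `w_rec`, `w_loc`) are hypothetical. It only shows the control flow, in which a glimpse is encoded, a recurrent state is updated, and that state proposes the next glimpse location.

```python
import numpy as np


def take_glimpse(image, center, size):
    """Crop a size x size patch near `center`, shifted to stay inside the image."""
    h, w = image.shape
    half = size // 2
    row = int(np.clip(center[0], half, h - size + half))
    col = int(np.clip(center[1], half, w - size + half))
    top, left = row - half, col - half
    return image[top:top + size, left:left + size]


def run_glimpse_loop(image, n_glimpses=3, size=8, hidden_dim=16, seed=0):
    """Toy encode -> recurrent update -> next-location loop (untrained)."""
    rng = np.random.default_rng(seed)
    # Assumption: random weights stand in for learned parameters.
    w_enc = rng.normal(scale=0.1, size=(size * size, hidden_dim))
    w_rec = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    w_loc = rng.normal(scale=0.1, size=(hidden_dim, 2))

    h, w = image.shape
    state = np.zeros(hidden_dim)
    center = np.array([h / 2.0, w / 2.0])  # first glimpse at the image center
    centers = []
    for _ in range(n_glimpses):
        patch = take_glimpse(image, center, size)
        centers.append(center.copy())
        # Stand-in for the "what" encoder: one dense layer over the patch.
        features = np.tanh(patch.reshape(-1) @ w_enc)
        # Stand-in for the "where" decoder: a simple recurrent state update.
        state = np.tanh(features + state @ w_rec)
        # The state proposes the next glimpse center in image coordinates.
        offset = np.tanh(state @ w_loc)
        center = np.array([h, w]) / 2.0 * (1.0 + offset)
    return state, centers


image = np.random.default_rng(1).random((28, 28))
state, centers = run_glimpse_loop(image)
print(state.shape, len(centers))
```

In the model summarized in the abstract, the recurrent state also modulates routing in the encoder; this sketch leaves that feedback path out.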
Pages: 17
Related papers
96 in total
  • [1] Deep-BCN: Deep networks meet biased competition to create a brain-inspired model of attention control
    Adeli, Hossein
    Zelinsky, Gregory
    [J]. PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2018, : 2013 - 2023
  • [2] A Model of the Superior Colliculus Predicts Fixation Locations during Scene Viewing and Visual Search
    Adeli, Hossein
    Vitu, Francoise
    Zelinsky, Gregory J.
    [J]. JOURNAL OF NEUROSCIENCE, 2017, 37 (06) : 1453 - 1467
  • [3] Reconstructing feedback representations in the ventral visual pathway with a generative adversarial autoencoder
    Al-Tahan, Haider
    Mohsenzadeh, Yalda
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2021, 17 (03)
  • [4] Ba J., 2015, ICLR (Poster)
  • [5] Bakhtiari Shahab, 2021, Advances in Neural Information Processing Systems
  • [6] Neural Mechanisms of Object-Based Attention
    Baldauf, Daniel
    Desimone, Robert
    [J]. SCIENCE, 2014, 344 (6182) : 424 - 427
  • [7] Going in circles is the way forward: the role of recurrence in visual inference
    van Bergen, Ruben S.
    Kriegeskorte, Nikolaus
    [J]. CURRENT OPINION IN NEUROBIOLOGY, 2020, 65 : 176 - 193
  • [8] Attention, Intention, and Priority in the Parietal Lobe
    Bisley, James W.
    Goldberg, Michael E.
    [J]. ANNUAL REVIEW OF NEUROSCIENCE, VOL 33, 2010, 33 : 1 - 21
  • [9] Generative Feedback Explains Distinct Brain Activity Codes for Seen and Mental Images
    Breedlove, Jesse L.
    St-Yves, Ghislain
    Olman, Cheryl A.
    Naselaris, Thomas
    [J]. CURRENT BIOLOGY, 2020, 30 (12) : 2211 - +
  • [10] Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
    Cadieu, Charles F.
    Hong, Ha
    Yamins, Daniel L. K.
    Pinto, Nicolas
    Ardila, Diego
    Solomon, Ethan A.
    Majaj, Najib J.
    DiCarlo, James J.
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2014, 10 (12)