Multi-modal shared module that enables the bottom-up formation of map representation and top-down map reading

Cited by: 0
Authors
Noguchi, Wataru [1]
Iizuka, Hiroyuki [1, 2]
Yamamoto, Masahito [1, 2]
Affiliations
[1] Hokkaido Univ, Fac Informat Sci & Technol, Sapporo, Hokkaido, Japan
[2] Hokkaido Univ, Ctr Human Nat Artificial Intelligence & Neurosci, Sapporo, Hokkaido, Japan
Keywords
Cognitive map; multimodal learning; predictive learning; deep neural networks; symbol grounding; SPATIAL MAP; INTEGRATION
DOI
10.1080/01691864.2021.1993334
Chinese Library Classification number
TP24 [Robotics]
Subject classification code
080202; 1405
Abstract
Humans create internal models of an environment (i.e. cognitive maps) through subjective sensorimotor experiences and can also understand spatial locations by looking at an external map as a symbol of an environment. We simulate the development of the cognitive map from sensorimotor experiences and grounding of the external map in a single deep neural network model. Our proposed network has a shared module that processes the features of multiple modalities (i.e. vision, hearing, and touch) and even external maps in the same manner. The multiple modalities are encoded into feature vectors by modality-specific encoders, and the encoded features are processed by the same shared module. The proposed network was trained to predict the sensory inputs of a simulated mobile robot. After the predictive learning, the spatial representation was developed in the internal states of the shared module, and the same spatial representation was used for predicting multiple modalities, including the external map. The network can also perform spatial navigation by associating the external map with the cognitive map. This implies that the external maps are grounded in subjective sensorimotor experiences, being bridged through the developed internal spatial representation in the shared module.
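The data flow described in the abstract (modality-specific encoders feeding one shared module, whose internal state is then decoded into predictions) can be sketched roughly as follows. This is only an illustrative forward pass, not the authors' implementation: the layer sizes, the linear encoders and decoders, the plain tanh recurrent cell, and every name in the code are assumptions made for the example. Training and the robot's motion input are omitted.

```python
import math
import random

random.seed(0)

FEATURE_DIM = 8   # assumed size of the common feature vector
HIDDEN_DIM = 16   # assumed size of the shared module's internal state

# Hypothetical input sizes for each modality, including the external map.
MODALITY_DIMS = {"vision": 12, "hearing": 4, "touch": 3, "map": 6}


def init_matrix(rows, cols, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Modality-specific encoders and decoders: one weight matrix per modality.
encoders = {m: init_matrix(FEATURE_DIM, d) for m, d in MODALITY_DIMS.items()}
decoders = {m: init_matrix(d, HIDDEN_DIM) for m, d in MODALITY_DIMS.items()}

# Shared module: a single set of recurrent weights used for every modality.
W_in = init_matrix(HIDDEN_DIM, FEATURE_DIM)
W_rec = init_matrix(HIDDEN_DIM, HIDDEN_DIM)


def shared_step(hidden, feature):
    """Update the shared internal state from one encoded feature vector."""
    pre = [a + b for a, b in zip(matvec(W_in, feature), matvec(W_rec, hidden))]
    return [math.tanh(p) for p in pre]


def predict(modality_in, observation, modality_out, hidden):
    """Encode one modality, update the shared state, decode a prediction for another."""
    feature = matvec(encoders[modality_in], observation)
    hidden = shared_step(hidden, feature)
    prediction = matvec(decoders[modality_out], hidden)
    return prediction, hidden


# Example: a (random) visual observation updates the shared state,
# which is then read out as a prediction in the external-map modality.
hidden = [0.0] * HIDDEN_DIM
vision_obs = [random.random() for _ in range(MODALITY_DIMS["vision"])]
map_prediction, hidden = predict("vision", vision_obs, "map", hidden)
print(len(map_prediction), len(hidden))
```

The point of the sketch is that `W_in` and `W_rec` are the same for every modality, so any spatial structure that emerged in `hidden` during training would be available to all decoders, including the one for the external map.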
Pages: 85-99
Number of pages: 15