Multi-modal shared module that enables the bottom-up formation of map representation and top-down map reading

Cited by: 0

Authors:

Noguchi, Wataru ^{[1]}

Iizuka, Hiroyuki ^{[1,2]}

Yamamoto, Masahito ^{[1,2]}

Affiliations:

[1] Hokkaido Univ, Fac Informat Sci & Technol, Sapporo, Hokkaido, Japan

[2] Hokkaido Univ, Ctr Human Nat Artificial Intelligence & Neurosci, Sapporo, Hokkaido, Japan

Source:

ADVANCED ROBOTICS | 2022 / Vol. 36 / Issue 1-2

Keywords:

Cognitive map; multimodal learning; predictive learning; deep neural networks; symbol grounding; SPATIAL MAP; INTEGRATION;

DOI:

10.1080/01691864.2021.1993334

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification codes:

080202 ; 1405 ;

Abstract:

Humans create internal models of an environment (i.e. cognitive maps) through subjective sensorimotor experiences and can also understand spatial locations by looking at an external map as a symbol of an environment. We simulate the development of the cognitive map from sensorimotor experiences and grounding of the external map in a single deep neural network model. Our proposed network has a shared module that processes the features of multiple modalities (i.e. vision, hearing, and touch) and even external maps in the same manner. The multiple modalities are encoded into feature vectors by modality-specific encoders, and the encoded features are processed by the same shared module. The proposed network was trained to predict the sensory inputs of a simulated mobile robot. After the predictive learning, the spatial representation was developed in the internal states of the shared module, and the same spatial representation was used for predicting multiple modalities, including the external map. The network can also perform spatial navigation by associating the external map with the cognitive map. This implies that the external maps are grounded in subjective sensorimotor experiences, being bridged through the developed internal spatial representation in the shared module.
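
The data flow described in the abstract (modality-specific encoders feeding one shared module that is used to predict sensory inputs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the single tanh recurrent layer, the linear decoders, the motor input, and all names (`INPUT_DIMS`, `shared_step`, `predict_next`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not specify any of them.
INPUT_DIMS = {"vision": 64, "hearing": 8, "touch": 4, "map": 32}
FEATURE_DIM = 16   # size of the encoded feature vector (assumed)
STATE_DIM = 32     # size of the shared module's internal state (assumed)
MOTOR_DIM = 2      # size of the robot's motor command (assumed)


def init(shape):
    """Small random weights; a real model would learn these by prediction error."""
    return rng.normal(0.0, 0.1, size=shape)


# One encoder and one decoder per modality (modality-specific parts).
encoders = {m: init((d, FEATURE_DIM)) for m, d in INPUT_DIMS.items()}
decoders = {m: init((STATE_DIM, d)) for m, d in INPUT_DIMS.items()}

# A single set of recurrent weights used for every modality (the shared module).
W_in = init((FEATURE_DIM + MOTOR_DIM, STATE_DIM))
W_rec = init((STATE_DIM, STATE_DIM))


def shared_step(state, feature, motor):
    """Update the internal state from an encoded feature and a motor command."""
    x = np.concatenate([feature, motor])
    return np.tanh(x @ W_in + state @ W_rec)


def predict_next(modality, observation, motor, state):
    """Encode one modality, pass it through the shared module, decode a prediction."""
    feature = np.tanh(observation @ encoders[modality])
    new_state = shared_step(state, feature, motor)
    prediction = new_state @ decoders[modality]
    return prediction, new_state


if __name__ == "__main__":
    motor = np.zeros(MOTOR_DIM)
    for name, dim in INPUT_DIMS.items():
        pred, state = predict_next(name, np.ones(dim), motor, np.zeros(STATE_DIM))
        print(name, pred.shape, state.shape)
```

The point of the sketch is that `W_in` and `W_rec` are the same arrays regardless of which modality is processed, so any spatial representation that forms in the state is available to every decoder, including the one for the external map.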


Pages: 85-99

Page count: 15

