Online perceptual learning and natural language acquisition for autonomous robots

Times cited: 6
Authors
Alomari, Muhannad [1 ]
Li, Fangjun [1 ]
Hogg, David C. [1 ]
Cohn, Anthony G. [1 ,2 ,3 ,4 ]
Affiliations
[1] Univ Leeds, Sch Comp, Leeds, W Yorkshire, England
[2] Qingdao Univ Sci & Technol, Sch Mech & Elect Engn, Luzhong Inst Safety Environm Protect Engineering &, Qingdao, Peoples R China
[3] Tongji Univ, Coll Elect & Informat Engn, Shanghai, Peoples R China
[4] Shandong Univ, Sch Civil Engn, Jinan, Peoples R China
Keywords
Language and vision; Language acquisition; Language grounding; Grammar induction; MODELS;
DOI
10.1016/j.artint.2021.103637
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this work, the problem of bootstrapping knowledge in language and vision for autonomous robots is addressed through novel techniques in grammar induction and word grounding to the perceptual world. In particular, we demonstrate a system, called OLAV, which is able, for the first time, to (1) learn to form discrete concepts from sensory data; (2) ground language (n-grams) to these concepts; (3) induce a grammar for the language being used to describe the perceptual world; and moreover to do all this incrementally, without storing all previous data. The learning is achieved in a loosely supervised manner from raw linguistic and visual data. Moreover, the learnt model is transparent rather than a black box, and is thus open to human inspection. The visual data is collected using three different robotic platforms deployed in real-world and simulated environments and equipped with different sensing modalities, while the linguistic data is collected using online crowdsourcing tools and volunteers. The analysis performed on these robots demonstrates the effectiveness of the framework in learning visual concepts, language groundings and grammatical structure in these three online settings. (c) 2021 Published by Elsevier B.V.
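The abstract describes grounding n-grams to perceptual concepts incrementally, without storing previous observations. The following is a minimal illustrative sketch of one generic way such incremental, co-occurrence-based grounding could be organised; it is not the OLAV system described in the paper, and the class name, concept labels and scoring rule below are assumptions made purely for illustration.

```python
# Illustrative sketch only: NOT the OLAV implementation from the paper.
# Shows the general idea of grounding n-grams to perceptual concepts by
# keeping running co-occurrence counts, so raw data need not be stored.
from collections import defaultdict


class IncrementalGrounder:
    """Maintains n-gram/concept co-occurrence statistics online (hypothetical)."""

    def __init__(self):
        self.pair_counts = defaultdict(int)   # (ngram, concept) -> count
        self.ngram_counts = defaultdict(int)  # ngram -> count

    def observe(self, ngrams, concepts):
        """Update counts from one (description, percept) pair, then discard it."""
        for ng in ngrams:
            self.ngram_counts[ng] += 1
            for c in concepts:
                self.pair_counts[(ng, c)] += 1

    def grounding(self, ngram):
        """Rank concepts by how often they co-occur with the given n-gram."""
        total = self.ngram_counts.get(ngram, 0)
        if total == 0:
            return []
        scores = {c: n / total
                  for (ng, c), n in self.pair_counts.items() if ng == ngram}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical usage with assumed concept labels:
g = IncrementalGrounder()
g.observe(["red", "red block"], ["colour:red", "shape:cube"])
g.observe(["red"], ["colour:red"])
print(g.grounding("red"))  # [('colour:red', 1.0), ('shape:cube', 0.5)]
```

The only state retained is the pair of count tables, which is consistent with the "without storing all previous data" property claimed in the abstract, though the paper's actual learning and grammar-induction machinery is considerably richer.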
Pages: 32
References
81 in total
[1] Abney S., 1996, Natural Language Engineering, V2, P337, DOI 10.1017/S1351324997001599
[2] Al-omari M., 2017, THESIS U LEEDS SCH C
[3] Alayrac Jean-Baptiste, Bojanowski Piotr, Agrawal Nishant, Sivic Josef, Laptev Ivan, Lacoste-Julien Simon, Unsupervised Learning from Narrated Instruction Videos, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, P4575-4583
[4] Allen J.F., Maintaining Knowledge About Temporal Intervals, Communications of the ACM, 1983, V26 (11), P832-843
[5] Alomari M., 2016, WORKSH COGN KNOWL AC
[6] Alomari M., 2017, P INT MULT SENS OBJ
[7] Alomari M., 2017, Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, P1395
[8] Alomari M., 2017, AAAI Conference on Artificial Intelligence, P4349
[9] Alomari M., 2016, Fifteenth International Conference on the Principles of Knowledge Representation and Reasoning, P505
[10] [Anonymous], 2000, Evolution of comm., DOI 10.1075/EOC.4.1.03STE