Cross-Situational Learning with Reservoir Computing for Language Acquisition Modelling
Cited by: 6
Authors:
Juven, Alexis [1,2,3]
Hinaut, Xavier [1,2,3]
Affiliations:
[1] INRIA Bordeaux Sud Ouest, Bordeaux, France
[2] Bordeaux INP, LaBRI, CNRS, UMR 5800, Bordeaux, France
[3] Univ Bordeaux, CNRS, UMR 5293, Inst Malad Neurodegenerat, Bordeaux, France
Source:
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
Keywords:
Recurrent Neural Networks;
Reservoir Computing;
Echo State Networks;
Language Learning;
Cross-situational Learning;
Unsupervised Learning;
Language Acquisition;
DOI:
10.1109/ijcnn48605.2020.9206650
Chinese Library Classification:
TP18 [Theory of Artificial Intelligence]
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
Understanding the mechanisms that enable children to rapidly learn word-to-meaning mappings through cross-situational learning under uncertain conditions is still a matter of debate. In particular, many models operate only at the word level rather than at the level of full sentence comprehension. We present a model of language acquisition that applies cross-situational learning to Recurrent Neural Networks within the Reservoir Computing paradigm. Using the co-occurrences between words and visual perceptions, the model learns to ground a complex sentence, describing a scene involving different objects, into a perceptual representation space. The model processes sentences describing scenes it perceives simultaneously via a simulated vision module: sentences are the inputs and the simulated vision is the target output of the RNN. Evaluations of the model show its capacity to extract the semantics of virtually hundreds of thousands of possible sentence combinations (based on a context-free grammar); remarkably, the model generalises after only a few hundred partially described scenes via cross-situational learning. Furthermore, it handles polysemous and synonymous words, and deals with complex sentences where word order is crucial for understanding. Finally, further improvements of the model are discussed in order to reach proper reinforcement and self-supervised learning schemes, with the goal of enabling robots to acquire and ground language by themselves (with no oracle supervision).
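To make the described pipeline concrete, the following is a minimal NumPy sketch of an Echo State Network trained in this cross-situational fashion: one-hot word inputs drive a fixed random reservoir, and a ridge-regression readout maps the reservoir state to the scene's perceptual vector. The toy vocabulary, dimensions, hyperparameters, and target encoding are illustrative assumptions, not the paper's actual configuration, and for simplicity the readout here uses only the final reservoir state of each sentence.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (illustrative, not the paper's dataset).
vocab = ["the", "cup", "ball", "is", "left", "right", "of"]
word2idx = {w: i for i, w in enumerate(vocab)}
n_in = len(vocab)   # one-hot word input per timestep
n_res = 300         # reservoir size (assumed)
n_out = 6           # size of the perceptual target vector (assumed)

# Fixed random ESN weights: only the readout is trained.
W_in = rng.uniform(-1.0, 1.0, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= rng.random((n_res, n_res)) < 0.1            # sparse connectivity
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

def run_reservoir(sentence, leak=0.3):
    # Feed one-hot words through the leaky-integrator reservoir and
    # return the state after the last word of the sentence.
    x = np.zeros(n_res)
    for w in sentence.split():
        u = np.zeros(n_in)
        u[word2idx[w]] = 1.0
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
    return x

def train_readout(sentences, targets, ridge=1e-6):
    # Cross-situational training: each sentence is paired with the
    # (possibly partial) perceptual vector of the scene it describes;
    # ridge regression exploits word/percept co-occurrence statistics.
    X = np.stack([run_reservoir(s) for s in sentences])  # (N, n_res)
    Y = np.stack(targets)                                # (N, n_out)
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y).T

# Toy usage with made-up perceptual target vectors.
sents = ["the cup is left of the ball", "the ball is right of the cup"]
targs = [np.array([1, 0, 1, 0, 0, 1.0]), np.array([0, 1, 0, 1, 1, 0.0])]
W_out = train_readout(sents, targs)
prediction = W_out @ run_reservoir(sents[0])  # grounded meaning estimate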