Bootstrapping language acquisition

Cited by: 47
Authors
Abend, Omri [1, 2, 3]
Kwiatkowski, Tom [1, 4]
Smith, Nathaniel J. [1, 5]
Goldwater, Sharon [1]
Steedman, Mark [1]
Affiliations
[1] Univ Edinburgh, Informat, Edinburgh, Midlothian, Scotland
[2] Hebrew Univ Jerusalem, Dept Comp Sci, Jerusalem, Israel
[3] Hebrew Univ Jerusalem, Dept Cognit Sci, Jerusalem, Israel
[4] Google Res, Mountain View, CA USA
[5] Univ Calif Berkeley, Berkeley Inst Data Sci, Berkeley, CA 94720 USA
Keywords
Language acquisition; Syntactic bootstrapping; Semantic bootstrapping; Computational modeling; Bayesian model; Cross-situational learning; COMPUTATIONAL MODEL; GRAMMAR; WORDS; CUE; INFORMATION; EMERGENCE; MECHANISM; SELECTION; TURKISH; SYNTAX;
DOI
10.1016/j.cognition.2017.02.009
Chinese Library Classification
B84 [Psychology];
Subject classification code
04 ; 0402 ;
Abstract
The semantic bootstrapping hypothesis proposes that children acquire their native language through exposure to sentences of the language paired with structured representations of their meaning, whose component substructures can be associated with words and syntactic structures used to express these concepts. The child's task is then to learn a language-specific grammar and lexicon based on (probably contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations (logical forms). Starting from these assumptions, we develop a Bayesian probabilistic account of semantically bootstrapped first-language acquisition in the child, based on techniques from computational parsing and interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syntax learning: the projection of lexical elements onto sentences by universal construction-free syntactic rules. Using an incremental learning algorithm, we apply the model to a dataset of real syntactically complex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but irrelevant distractors. Taking the Eve section of the CHILDES corpus as input, the model simulates several well-documented phenomena from the developmental literature. In particular, the model exhibits syntactic bootstrapping effects (in which previously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the "vocabulary spurt"), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings. The learner thus demonstrates how statistical learning over structured representations can provide a unified account for these seemingly disparate phenomena. (C) 2017 Elsevier B.V. All rights reserved.
Pages: 116-143
Number of pages: 28