Computational learning of construction grammars

Cited by: 20
Author
Dunn, Jonathan [1]
Affiliation
[1] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Keywords
construction grammar; grammar induction; multi-unit association measures; poverty of the stimulus; ENGLISH
DOI
10.1017/langcog.2016.7
Chinese Library Classification (CLC) number
H0 [Linguistics]
Subject classification codes
030303; 0501; 050102
Abstract
This paper presents an algorithm for learning the construction grammar of a language from a large corpus. This grammar induction algorithm has two goals: first, to show that construction grammars are learnable without highly specified innate structure; second, to develop a model of which units do or do not constitute constructions in a given dataset. The basic task of construction grammar induction is to identify the minimum set of constructions that represents the language in question with maximum descriptive adequacy. These constructions must (1) generalize across an unspecified number of units while (2) containing mixed levels of representation internally (e.g., both item-specific and schematized representations), and (3) allowing for unfilled and partially filled slots. Additionally, these constructions may (4) contain recursive structure within a given slot that needs to be reduced in order to produce a sufficiently schematic representation. In other words, these constructions are multi-length, multi-level, possibly discontinuous co-occurrences which generalize across internal recursive structures. These co-occurrences are modeled using frequency and the Delta P measure of association, expanded in novel ways to cover multi-unit sequences. This work provides important new evidence for the learnability of construction grammars as well as a tool for the automated corpus analysis of constructions.
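As background for the association measure named in the abstract, the following is a minimal sketch of the standard two-unit, left-to-right Delta P, ΔP(Y|X) = P(Y|X) - P(Y|not X), computed over adjacent word pairs. It is not the paper's multi-unit extension, which the abstract does not specify. The function names `bigrams` and `delta_p` and the toy sentence are assumptions made for illustration only.

```python
def bigrams(tokens):
    """Return all adjacent (left, right) token pairs."""
    return list(zip(tokens, tokens[1:]))


def delta_p(pairs, x, y):
    """Left-to-right Delta P: P(y follows | left is x) - P(y follows | left is not x).

    Illustrative two-unit version only; hypothetical helper, not the paper's code.
    """
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for left, right in pairs:
        if left == x and right == y:
            a += 1  # x followed by y
        elif left == x:
            b += 1  # x followed by something else
        elif right == y:
            c += 1  # something else followed by y
        else:
            d += 1  # neither
    p_y_given_x = a / (a + b) if (a + b) else 0.0
    p_y_given_not_x = c / (c + d) if (c + d) else 0.0
    return p_y_given_x - p_y_given_not_x


# Toy corpus (an assumption for illustration, not data from the paper).
tokens = "give me the book give me the pen take the book".split()
pairs = bigrams(tokens)
print(delta_p(pairs, "give", "me"))
print(delta_p(pairs, "the", "book"))
```

Delta P is directional: swapping the roles of the two units (conditioning on the right-hand unit instead of the left) generally gives a different value, which is one reason it is attractive for modeling slot-filler relations.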
Pages: 254-292
Number of pages: 39