Coreference in English OntoNotes: Properties and Genre Differences

Cited by: 1
Authors
Aktas, Berfin [1]
Scheffler, Tatjana [1]
Stede, Manfred [1]
Affiliations
[1] Univ Potsdam, Res Focus Cognit Sci, SFB1287, Potsdam, Germany
Source
TEXT, SPEECH, AND DIALOGUE (TSD 2019) | 2019 / Vol. 11697
Keywords
OntoNotes; Coreference; Genre; Spoken; Written
DOI
10.1007/978-3-030-27947-9_15
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The OntoNotes corpus is widely used for training and testing coreference resolution systems, but little attention has so far been given to the differences between the genres of language that make up the corpus. We are primarily interested in the contrast between spoken and written language, and thus conducted in-depth analyses of various reference-related properties of the OntoNotes sub-corpora, which yield several statistically significant differences. We compare these to predictions made in the linguistics literature and draw some conclusions for potential genre-specific implementations of coreference resolution.
Pages: 171-184
Page count: 14