Lemmatising Verbs in Middle English Corpora: The Benefit of Enriching the Penn-Helsinki Parsed Corpus of Middle English 2 (PPCME2), the Parsed Corpus of Middle English Poetry (PCMEP), and A Parsed Linguistic Atlas of Early Middle English (PLAEME)

Citations: 0
Authors
Percillier, Michael [1]
Trips, Carola [1]
Affiliations
[1] Univ Mannheim, B6,30-32, D-68159 Mannheim, Germany
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
Lemmatisation; Middle English; verb argument structure
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
This paper describes the lemmatisation of three annotated corpora of Middle English - the Penn-Helsinki Parsed Corpus of Middle English 2 (PPCME2), the Parsed Corpus of Middle English Poetry (PCMEP), and A Parsed Linguistic Atlas of Early Middle English (PLAEME) - which is a prerequisite for systematically investigating the argument structures of verbs of the given time. Creating this tool and enriching existing parsed corpora of Middle English is part of the project Borrowing of Argument Structure in Contact Situations (BASICS), which seeks to explain to what extent verbs copied from Old French had an impact on the grammar of Middle English. First, we lemmatised the PPCME2 by (1) creating an inventory of form-lemma correspondences linking forms in the PPCME2 to lemmas in the MED, and (2) inserting this lemma information into the corpus (precision: 94.85%, recall: 98.92%, accuracy: 94%). Second, we enriched the PCMEP and PLAEME, which adopted the annotation format of the PPCME2, with verb lemmas to undertake studies that fill the well-known data gap in the subperiod (1250-1350) of the PPCME2. The case study of reflexives shows that with our method we gain much more reliable results in terms of diachrony, diatopy, and contact-induced change.
Pages: 7170-7178
Page count: 9