Error-Annotated Corpus of Latvian

Cited by: 7
Authors
Deksne, Daiga [1]
Skadina, Inguna [1]
Affiliations
[1] Tilde SIA, Riga, Latvia
Source
HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2014 | 2014 / Vol. 268
Keywords
Error classification; corpus annotation; error annotated corpus; grammar checking; Latvian language;
DOI
10.3233/978-1-61499-442-8-163
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper reports on the development of the annotated Latvian language error corpus designed for grammar checker development and evaluation. We describe the error classification system introduced for this purpose, the annotation process, and guidelines. Two corpora (the corpus of student papers and the balanced text corpus) consisting of a total of 20,877 sentences have been created and annotated. A general characterisation of the corpora and a summary of the annotation results are presented.
Pages: 163-166
Page count: 4
Related papers
7 total
  • [1] [Anonymous], 2013, LATV VAL GRAM
  • [2] [Anonymous], 2003, P CORPUS LINGUISTICS
  • [3] Becker M., 2003, TREEBANKS BUILDING U, V20, P89
  • [4] Dahlmeier D., 2013, 8 WORKSH INN US NLP, P22
  • [5] Deksne D, 2014, LECT NOTES COMPUT SC, V8403, P237, DOI 10.1007/978-3-642-54906-9_19
  • [6] Freimane I., 1993, VAL KULT TEOR SKAT
  • [7] Granger S., 1993, The European English Messenger, V2, P34