Dependency Parsing of Estonian: Statistical and Rule-based Approaches

Cited by: 3
Authors
Muischnek, Kadri [1]
Müürisep, Kaili [1]
Puolakainen, Tiina [1]
Affiliation
[1] Univ Tartu, Tartu, Estonia
Source
HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2014 | 2014 / Vol. 268
Keywords
Estonian syntax; dependency parsing; treebank;
DOI
10.3233/978-1-61499-442-8-111
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper gives an overview of the latest developments in computational syntactic analysis of Estonian. We present the Estonian Dependency Treebank, an ongoing corpus annotation project. Although treebank construction is still under way, we have used it to train MaltParser and to experiment with combining MaltParser with a rule-based Constraint Grammar (CG) parser for Estonian. MaltParser achieves an unlabeled attachment score (UAS; correct links to the head node) of 83.4% and a label accuracy (LA) of 88.6%. The labeled attachment score (LAS) was 80.3%. Applying different algorithms for combining MaltParser with the Constraint Grammar parser improved the results by 1%. A special CG rule set for fixing some typical MaltParser errors improved the UAS by up to 1.5%.
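The three scores reported in the abstract can be illustrated with a minimal sketch. It assumes the usual token-level definitions: UAS counts tokens whose predicted head is correct, LA counts tokens whose predicted dependency label is correct, and LAS counts tokens where both are correct. The function name `attachment_scores`, the `(head, label)` tuple layout, and the toy labels below are hypothetical illustrations; they are not taken from the paper, from MaltParser, or from the Estonian Dependency Treebank tagset.

```python
from typing import Dict, List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 stands for the artificial root. This layout is an assumption.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Return UAS, LA and LAS as fractions over all tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty token sequence")

    total = len(gold)
    # UAS: the predicted head matches the gold head.
    heads_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LA: the predicted label matches the gold label.
    labels_ok = sum(g[1] == p[1] for g, p in zip(gold, pred))
    # LAS: head and label both match.
    both_ok = sum(g == p for g, p in zip(gold, pred))

    return {
        "UAS": heads_ok / total,
        "LA": labels_ok / total,
        "LAS": both_ok / total,
    }


# Toy four-token sentence with made-up labels.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (3, "mod")]
pred = [(2, "subj"), (0, "root"), (2, "mod"), (2, "mod")]
scores = attachment_scores(gold, pred)
print(scores)
```

In the toy example one token has the right head but the wrong label and another has the right label but the wrong head, so LAS comes out lower than both UAS and LA. The same pattern appears in the reported figures, where LAS (80.3%) is below UAS (83.4%) and LA (88.6%).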
Pages: 111 / +
Page count: 2