Orwell's 1984-From Simple to Multi-word Units

Cited by: 1
Authors
Krstev, Cvetana [1]
Vitas, Dusko [2]
Trtovac, Aleksandra [3]
Affiliations
[1] Univ Belgrade, Fac Philol, Studentski Trg 1, Belgrade, Serbia
[2] Univ Belgrade, Fac Math, YU-11001 Belgrade, Serbia
[3] Univ Belgrade, Univ Library, Belgrade, Serbia
Source
HUMAN LANGUAGE TECHNOLOGY CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS | 2014 / Vol. 8387
Keywords
Morphosyntactic annotation; Multi-word units; Finite-state transducers; MULTEXT-East
DOI
10.1007/978-3-319-08958-4_23
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present an alternative version of the morphosyntactically annotated Serbian translation of 1984. This version follows the basic principles of the MULTEXT-East version, with one addition: the text is also annotated with multi-word units. We present the resources used for multi-word unit annotation and explain how these resources were enriched with multi-word units extracted from the processed text. Finally, we present the format of this alternative version and the benefits obtained both from preparing the new resource and from the resource itself.
Pages: 276-287
Number of pages: 12