Orwell's 1984 - From Simple to Multi-word Units

Cited by: 1
Authors
Krstev, Cvetana [1]
Vitas, Dusko [2]
Trtovac, Aleksandra [3]
Affiliations
[1] Univ Belgrade, Fac Philol, Studentski Trg 1, Belgrade, Serbia
[2] Univ Belgrade, Fac Math, YU-11001 Belgrade, Serbia
[3] Univ Belgrade, Univ Library, Belgrade, Serbia
Source
HUMAN LANGUAGE TECHNOLOGY CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS | 2014 / Vol. 8387
Keywords
Morphosyntactic annotation; Multi-word units; Finite-state transducers; MULTEXT-East
DOI
10.1007/978-3-319-08958-4_23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present an alternative version of the morphosyntactically annotated Serbian translation of 1984. This version follows the basic principles of the MULTEXT-East version, with one addition: the text is annotated with multi-word units as well. We present the resources used for annotation with multi-word units and explain how these resources were enriched with multi-word units extracted from the processed text. Finally, we present the format of this alternative version and the benefits obtained both from preparing the new resource and from the resource itself.
Pages: 276-287
Page count: 12
Related papers
27 in total
  • [1] Alegria Inaki., 2004, ACL WORKSHOP MULTIWO, P48
  • [2] [Anonymous], 2011, P 5 LING ANN WORKSH
  • [3] Bozovic M, 2010, THESIS
  • [4] Courtois B, 1990, DICT ELECT FRANCAIS
  • [5] Delic V, 2009, PROCEEDINGS OF THE 8TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS (CIMMACS '09), P98
  • [6] Dimitrova Ludmila., 1998, Proceedings of the Thirty-Sixth Annual Meeting of the Association for Computational Linguistics and Seventeenth International Conference on Computational Linguistics, P315
  • [7] Dordevie B., 2012, P WORKSH COMP LING N, P89
  • [8] Erjavec T., 1998, E MEETS W COMPENDIUM
  • [9] Erjavec, Tomaz. MULTEXT-East: morphosyntactic resources for Central and Eastern European languages. LANGUAGE RESOURCES AND EVALUATION, 2012, 46 (01): 131-142
  • [10] Erjavec Tomaz., 2004, Fourth International Conference on Language Resources and Evaluation, V4, P1535