South-East European Times: A parallel corpus of Balkan languages

Cited by: 0
Authors
Tyers, Francis M. [1]
Alperen, Murat Serdar
Affiliation
[1] Univ Alacant, Dept Lleng & Sist Informat, E-03071 Alacant, Spain
Source
LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2010
Keywords
DOI
Not available
Chinese Library Classification number
H [Language, Writing];
Subject classification code
05;
Abstract
This paper describes the creation of a parallel corpus from a multilingual news website translated into eight languages of the Balkans (Albanian, Bulgarian, Croatian, Greek, Macedonian, Romanian, Serbian, and Turkish) and English. The corpus is then applied to the task of machine translation, creating 72 machine translation systems. The performance of these systems is then evaluated and thought is given to where future work might be focussed.
Pages: H49-H53
Page count: 5