Regression for machine translation evaluation at the sentence level

Cited by: 10
Authors
Albrecht, Joshua S. [1 ]
Hwa, Rebecca [1 ]
Affiliations
[1] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA
Keywords
Machine translation; Evaluation metrics; Machine learning
DOI
10.1007/s10590-008-9046-1
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Machine learning offers a systematic framework for developing metrics that use multiple criteria to assess the quality of machine translation (MT). However, learning introduces additional complexities that may affect the resulting metric's effectiveness. First, a learned metric is more reliable for translations that are similar to its training examples; this calls into question whether it is as effective in evaluating translations from systems that are not its contemporaries. Second, metrics trained from different sets of training examples may exhibit variations in their evaluations. Third, expensive developmental resources (such as translations that have been evaluated by humans) may be needed as training examples. This paper investigates these concerns in the context of using regression to develop metrics for evaluating machine-translated sentences. We track a learned metric's reliability across a 5-year period to measure the extent to which the learned metric can evaluate sentences produced by other systems. We compare metrics trained under different conditions to measure their variations. Finally, we present an alternative formulation of metric training in which the features are based on comparisons against pseudo-references in order to reduce the demand on human-produced resources. Our results confirm that regression is a useful approach for developing new metrics for MT evaluation at the sentence level.
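The regression formulation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the feature set (mean 1-gram and 2-gram precision against pseudo-references), the linear model, and all data below are invented here for clarity. The key idea is that outputs from other MT systems stand in for human references, features are computed against them, and a regression model maps features to human quality scores.

```python
import numpy as np

def ngram_precision(hyp, ref, n):
    """Fraction of the hypothesis's n-grams that also occur in the reference."""
    hyp_ngrams = [tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1)]
    if not hyp_ngrams:
        return 0.0
    ref_ngrams = {tuple(ref[i:i + n]) for i in range(len(ref) - n + 1)}
    return sum(g in ref_ngrams for g in hyp_ngrams) / len(hyp_ngrams)

def featurize(hyp, pseudo_refs):
    """Bias term plus mean unigram/bigram precision against the pseudo-references."""
    uni = sum(ngram_precision(hyp, r, 1) for r in pseudo_refs) / len(pseudo_refs)
    bi = sum(ngram_precision(hyp, r, 2) for r in pseudo_refs) / len(pseudo_refs)
    return [1.0, uni, bi]

def train_metric(hyps, scores, pseudo_refs):
    """Least-squares regression from pseudo-reference features to human scores."""
    X = np.array([featurize(h, pseudo_refs) for h in hyps])
    y = np.array(scores, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def score(hyp, w, pseudo_refs):
    """Apply the learned metric to a new machine-translated sentence."""
    return float(np.array(featurize(hyp, pseudo_refs)) @ w)

# Toy example: two MT-system outputs serve as pseudo-references,
# and five hypotheses carry (invented) human adequacy scores.
pseudo_refs = ["the cat sat on the mat".split(), "a cat sat on a mat".split()]
hyps = ["the cat sat on the mat".split(), "cat the mat sat".split(),
        "the dog ran".split(), "a cat sat on the mat".split(),
        "mat on sat".split()]
scores = [5, 2, 1, 5, 2]
w = train_metric(hyps, scores, pseudo_refs)
```

With these weights, a fluent held-out hypothesis such as `"a cat sat on a mat"` receives a higher score than an unrelated one such as `"ran dog"`, mimicking how the learned metric ranks sentence-level quality.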
Pages: 1-27 (27 pages)
Related papers
50 records total
[31] A review of machine transliteration, translation, evaluation metrics and datasets in Indian Languages [J].
Jha, Abhinav; Patil, Hemprasad Yashwant.
Multimedia Tools and Applications, 2023, 82(15): 23509-23540
[33] Involving language professionals in the evaluation of machine translation [J].
Popović, Maja; Avramidis, Eleftherios; Burchardt, Aljoscha; Hunsicker, Sabine; Schmeier, Sven; Tscherwinka, Cindy; Vilar, David; Uszkoreit, Hans.
Language Resources and Evaluation, 2014, 48(4): 541-559
[34] CCMT 2019 Machine Translation Evaluation Report [J].
Yang, Muyun; Hu, Xixin; Xiong, Hao; Wang, Jiayi; Jiaermuhamaiti, Yiliyaer; He, Zhongjun; Luo, Weihua; Huang, Shujian.
Machine Translation: CCMT 2019, 2019, 1104: 105-128
[35] The METEOR metric for automatic evaluation of machine translation [J].
Lavie, Alon; Denkowski, Michael J.
Machine Translation, 2009, 23(2-3): 105-115
[36] A preliminary evaluation of metadata records machine translation [J].
Chen, Jiangping; Ding, Ren; Jiang, Shan; Knudson, Ryan.
Electronic Library, 2012, 30(2): 264-277
[37] An automatic evaluation of machine translation and Slavic languages [J].
Munkova, Dasa; Munk, Michal.
2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT), 2014: 447-451
[39] Linguistic measures for automatic machine translation evaluation [J].
Giménez, J.; Màrquez, L.
Machine Translation, 2010, 24(3-4): 209-240
[40] Using Contextual Information for Machine Translation Evaluation [J].
Fomicheva, Marina; Bel, Nuria.
LREC 2016 - Tenth International Conference on Language Resources and Evaluation, 2016: 2755-2761