A Monte Carlo Method for Metamorphic Testing of Machine Translation Services

Cited by: 24
Authors
Pesu, Daniel [1]
Zhou, Zhi Quan [2]
Zhen, Jingfeng [1]
Towey, Dave [3]
Affiliations
[1] Univ Wollongong, Sch Comp & Informat Technol, Wollongong, NSW 2522, Australia
[2] Univ Wollongong, Sch Comp & Informat Technol, Inst Cybersecur & Cryptol, Wollongong, NSW 2522, Australia
[3] Univ Nottingham Ningbo China, Sch Comp Sci, Ningbo 315100, Zhejiang, Peoples R China
Source
2018 IEEE/ACM 3RD INTERNATIONAL WORKSHOP ON METAMORPHIC TESTING (MET 2018) | 2018
Funding
Australian Research Council;
Keywords
Machine translation quality; oracle problem; metamorphic testing; Monte Carlo method; natural languages;
DOI
10.1145/3193977.3193980
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
With the growing popularity of machine translation services, it has become increasingly important to be able to assess their quality. However, the test oracle problem makes it difficult to conduct automated testing. In this paper, we propose a Monte Carlo method, in combination with metamorphic testing, to overcome the oracle problem. Using this method, we assessed the quality of three popular machine translation services - namely, Google Translate, Microsoft Translator, and Youdao Translate. We set the source language to be English, and the target languages included Chinese, French, Japanese, Korean, Portuguese, Russian, Spanish, and Swedish. A sample of 33,600 observations (involving a total of 100,800 actual translations) was collected and analyzed using a 3 x 56 factorial design. Based on this data, our model found Google Translate to be the best (in terms of the metamorphic relation used) for each and every target language considered. A trend for Indo-European languages producing better results was also identified.
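The sample-size figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is not taken from the paper: the function name `design_summary` and the assumption that observations are spread evenly over the cells of the 3 x 56 design are illustrative only. Under that assumption, the figures imply 168 cells, 200 observations per cell, and 3 translations per observation.

```python
# Figures quoted in the abstract.
SERVICES = 3            # Google Translate, Microsoft Translator, Youdao Translate
LEVELS = 56             # second factor of the 3 x 56 design (not broken down in the abstract)
OBSERVATIONS = 33_600   # sample size
TRANSLATIONS = 100_800  # actual translations performed


def design_summary(services, levels, observations, translations):
    """Derive per-cell counts, ASSUMING a balanced design (equal observations per cell).

    `design_summary` is a hypothetical helper for illustration, not the authors' code.
    """
    cells = services * levels
    per_cell, cell_remainder = divmod(observations, cells)
    per_observation, obs_remainder = divmod(translations, observations)
    if cell_remainder or obs_remainder:
        raise ValueError("figures are not consistent with a balanced design")
    return {
        "cells": cells,
        "observations_per_cell": per_cell,
        "translations_per_observation": per_observation,
    }


summary = design_summary(SERVICES, LEVELS, OBSERVATIONS, TRANSLATIONS)
print(summary)
```

The check raises an error if the quoted totals do not divide evenly, which makes it a quick consistency test for the reported numbers rather than a reconstruction of the experiment itself.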
Pages: 38-45
Page count: 8
Related papers
19 entries in total
[1]  
Aiken Milam, 2010, TRANSL J, V14, P1
[2]  
[Anonymous], 2017, R LANG ENV STAT COMP
[3]  
[Anonymous], 1998, HKUSTCS9801
[4]  
[Anonymous], 2009, NATURAL LANGUAGE PRO, DOI 10.1007/S10579-010-9124-X
[5]   The Oracle Problem in Software Testing: A Survey [J].
Barr, Earl T. ;
Harman, Mark ;
McMinn, Phil ;
Shahbaz, Muzammil ;
Yoo, Shin .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2015, 41 (05) :507-525
[6]   EVALUATION OF 10 PAIRWISE MULTIPLE COMPARISON PROCEDURES BY MONTE-CARLO METHODS [J].
CARMER, SG ;
SWANSON, MR .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1973, 68 (341) :66-74
[7]   Metamorphic Testing: A Review of Challenges and Opportunities [J].
Chen, Tsong Yueh ;
Kuo, Fei-Ching ;
Liu, Huai ;
Poon, Pak-Lok ;
Towey, Dave ;
Tse, T. H. ;
Zhou, Zhi Quan .
ACM COMPUTING SURVEYS, 2018, 51 (01)
[8]   Fault-based testing without the need of oracles [J].
Chen, TY ;
Tse, TH ;
Zhou, ZQ .
INFORMATION AND SOFTWARE TECHNOLOGY, 2003, 45 (01) :1-9
[9]  
Cox D. R., 2000, The theory of design of experiments, V1st
[10]  
Huang A., 2008, NZCSRSC 2008