Word Closure-Based Metamorphic Testing for Machine Translation

Cited by: 0

Authors:

Xie, Xiaoyuan [1]

Jin, Shuo [1]

Chen, Songqiang [2]

Cheung, Shing-chi [2]

Affiliations:

[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China

[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2024, Vol. 33, No. 8

Funding:

National Natural Science Foundation of China;

Keywords:

Machine translation; metamorphic testing; word closure; deep learning testing;

DOI:

10.1145/3675396

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

With the wide application of machine translation, the testing of Machine Translation Systems (MTSs) has attracted much attention. Recent works apply Metamorphic Testing (MT) to address the oracle problem in MTS testing. Existing MT methods for MTSs generally follow a workflow of input transformation and output relation comparison: they generate a follow-up input sentence by mutating the source input, then compare the source and follow-up output translations to detect translation errors. These methods use various input transformations to generate test case pairs and have successfully triggered numerous translation errors. However, they are limited in their ability to perform fine-grained and rigorous output relation comparison, and thus may report many false alarms and miss many true errors. In this article, we propose a word closure-based output comparison method to address the limitations of existing MT methods for MTSs. We first propose word closure as a new comparison unit, where each closure includes a group of correlated input and output words in the test case pair. Word closures link each relevant fragment in the source output translation to its counterpart in the follow-up output for comparison. Next, we compare semantics at the level of word closures to identify translation errors. In this way, we perform a fine-grained and rigorous semantic comparison of the outputs and thus achieve more effective violation identification. We evaluate our method with the test cases generated by five existing input transformations and the translation outputs from three popular MTSs. Results show that our method significantly outperforms existing works in violation identification, improving both precision and recall and achieving an average increase of 29.9% in F1 score. It also helps to increase the F1 score of translation error localization by 35.9%.
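
Illustrative note (not part of the original record): the abstract describes a word closure as a group of correlated input and output words in a test case pair. One plausible way to picture this is as a connected component over word-to-word links. The sketch below is a toy under that assumption, not the authors' algorithm; the node labels (`src_in`, `src_out`, `fol_in`, `fol_out`), the `word_closures` function, and the example links are all hypothetical.

```python
from collections import defaultdict


def word_closures(links):
    """Group linked words into closures (connected components).

    `links` is an iterable of (node_a, node_b) pairs. Each node is a
    hypothetical (side, index) tuple, e.g. ("src_in", 2) for the third
    word of the source input sentence.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())


# Toy test case pair: word 0 is unchanged by the input transformation,
# word 1 is mutated, so its source and follow-up sides are not linked.
links = [
    (("src_in", 0), ("src_out", 0)),   # source input word -> its translation
    (("src_in", 0), ("fol_in", 0)),    # unchanged word across the two inputs
    (("fol_in", 0), ("fol_out", 0)),   # follow-up input word -> its translation
    (("src_in", 1), ("src_out", 1)),   # mutated word, source side
    (("fol_in", 1), ("fol_out", 1)),   # mutated word, follow-up side
]
closures = word_closures(links)
for closure in closures:
    print(closure)
```

In this toy, the closure containing both `src_out` and `fol_out` words is the unit whose two translation fragments would be compared semantically. How the links are obtained and how the comparison is done are left open here, since the abstract does not specify them.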


Pages: 46
