Word Closure-Based Metamorphic Testing for Machine Translation

Authors
Xie, Xiaoyuan [1]
Jin, Shuo [1]
Chen, Songqiang [2]
Cheung, Shing-chi [2]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Machine translation; metamorphic testing; word closure; deep learning testing;
DOI
10.1145/3675396
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
With the wide application of machine translation, the testing of Machine Translation Systems (MTSs) has attracted much attention. Recent works apply Metamorphic Testing (MT) to address the oracle problem in MTS testing. Existing MT methods for MTSs generally follow a two-step workflow: input transformation, which generates a follow-up input sentence by mutating the source input, and output relation comparison, which compares the source and follow-up output translations to detect translation errors. These methods use various input transformations to generate the test case pairs and have successfully triggered numerous translation errors. However, their output relation comparison is neither fine-grained nor rigorous, so they may report many false alarms and miss many true errors. In this article, we propose a word closure-based output comparison method to address these limitations. We first propose word closure as a new comparison unit, where each closure includes a group of correlated input and output words in the test case pair. Word closures link each fragment in the source output translation to its counterpart in the follow-up output for comparison. Next, we compare the semantics at the level of word closures to identify translation errors. In this way, we perform a fine-grained and rigorous semantic comparison of the outputs and thus identify violations more effectively. We evaluate our method with the test cases generated by five existing input transformations and the translation outputs from three popular MTSs. Results show that our method significantly outperforms existing works in violation identification, improving both precision and recall and achieving an average increase of 29.9% in F1 score. It also increases the F1 score of translation error localization by 35.9%.
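The idea of comparing outputs per word closure can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on strong simplifying assumptions: the word alignments (`links`) are supplied by hand rather than computed, a closure is approximated as a connected component of those alignment links, and "semantic comparison" is reduced to exact token equality. The function names, the node encoding `(side, index)`, and the English-to-French example sentences are all hypothetical.

```python
from collections import defaultdict


def word_closures(links):
    """Group linked words into connected components (a stand-in for closures)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def find_violations(src_out, fol_out, links, mutated):
    """Report closures whose output fragments differ although no mutated word is involved."""
    violations = []
    for closure in word_closures(links):
        if closure & mutated:
            continue  # fragments tied to the mutated word are expected to differ
        src_frag = [src_out[i] for side, i in sorted(closure) if side == "src_out"]
        fol_frag = [fol_out[i] for side, i in sorted(closure) if side == "fol_out"]
        if src_frag != fol_frag:  # assumption: exact match instead of semantic comparison
            violations.append((src_frag, fol_frag))
    return violations


# Hypothetical test case pair: "red" is mutated to "green" in the follow-up input.
src_in = ["he", "eats", "a", "red", "apple"]
fol_in = ["he", "eats", "a", "green", "apple"]
src_out = ["il", "mange", "une", "pomme", "rouge"]
fol_out = ["il", "boit", "une", "pomme", "verte"]  # "boit" is an injected error

# Hand-written alignments (input index -> output index) for both translations.
alignment = {0: 0, 1: 1, 2: 2, 3: 4, 4: 3}
links = []
for i, j in alignment.items():
    links.append((("src_in", i), ("fol_in", i)))   # same position in both inputs
    links.append((("src_in", i), ("src_out", j)))  # source input -> source output
    links.append((("fol_in", i), ("fol_out", j)))  # follow-up input -> follow-up output

mutated = {("src_in", 3), ("fol_in", 3)}
violations = find_violations(src_out, fol_out, links, mutated)
print(violations)
```

In this toy pair, the closure containing the mutated word ("red"/"green" and "rouge"/"verte") is skipped, while the closure around "eats" exposes the mismatch between "mange" and "boit". A real system would need automatic word alignment and a semantic similarity measure in place of the hand-written links and the equality check.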
Pages: 46