Multi-view fusion for universal translation quality estimation

Cited by: 2
Authors
Huang, Hui [1 ]
Wu, Shuangzhi [2 ]
Chen, Kehai [3 ]
Liang, Xinnian [4 ]
Di, Hui [5 ]
Yang, Muyun [1 ]
Zhao, Tiejun [1 ]
Affiliations
[1] Harbin Inst Technol, Fac Comp, Harbin, Peoples R China
[2] ByteDance AI Lab, Beijing, Peoples R China
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen, Peoples R China
[4] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China
[5] Toshiba Co Ltd, Res & Dev Ctr, Beijing, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Translation quality estimation; Machine translation; Pre-trained model; Large language model;
DOI
10.1016/j.inffus.2023.102022
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine translation quality estimation (QE) aims to evaluate translation outputs without access to reference translations. Despite recent progress, state-of-the-art QE models have been shown to be biased: they over-rely on spurious statistical features while ignoring bilingual semantic adequacy, which degrades performance. Moreover, existing approaches require large amounts of annotated data, restricting their application to new domains and languages. In this work, we propose a universal framework for quality estimation based on multi-view fusion. We first introduce noise into the target side of a parallel sentence pair, using either a pre-trained language model or a large language model. Then, treating the clean parallel pairs and the noised pairs as different views, the QE model is trained to distinguish the clean pairs from the noised ones. Our method improves accuracy and generalizability in the supervised scenario and can perform estimation on its own in the zero-shot scenario. Experiments on WMT QE evaluation datasets under different scenarios verify the effectiveness of our method. We also present an in-depth investigation of the bias of QE models.
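The abstract describes the training recipe only at a high level. The following is a minimal sketch of that idea under stated assumptions: target-side noise is produced with a masked language model, and a cross-lingual encoder is trained to separate clean source-target pairs from noised ones. The model name (xlm-roberta-base), masking ratio, two-label setup, and all function names are illustrative assumptions, not the authors' released implementation.

    # Sketch only: clean vs. MLM-noised parallel pairs as two views for QE training.
    # All choices below (model, mask ratio, binary labels) are assumptions for illustration.
    import random
    import torch
    from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                              AutoModelForSequenceClassification)

    mlm_name = "xlm-roberta-base"              # assumed noiser; the paper also allows an LLM
    tok = AutoTokenizer.from_pretrained(mlm_name)
    mlm = AutoModelForMaskedLM.from_pretrained(mlm_name).eval()

    def noise_target(tgt: str, mask_ratio: float = 0.3) -> str:
        """Mask a fraction of target tokens and refill them with MLM predictions."""
        ids = tok(tgt, return_tensors="pt")["input_ids"][0]
        positions = [i for i in range(1, len(ids) - 1) if random.random() < mask_ratio]
        noised = ids.clone()
        noised[positions] = tok.mask_token_id
        with torch.no_grad():
            logits = mlm(noised.unsqueeze(0)).logits[0]
        for i in positions:
            noised[i] = int(logits[i].argmax())     # replace mask with MLM's top prediction
        return tok.decode(noised[1:-1], skip_special_tokens=True)

    # QE model: a cross-encoder over <src, tgt>; clean view -> label 1, noised view -> label 0.
    qe = AutoModelForSequenceClassification.from_pretrained(mlm_name, num_labels=2)
    optim = torch.optim.AdamW(qe.parameters(), lr=2e-5)

    def train_step(src: str, tgt: str) -> float:
        noised_tgt = noise_target(tgt)
        batch = tok([src, src], [tgt, noised_tgt], return_tensors="pt",
                    padding=True, truncation=True)
        labels = torch.tensor([1, 0])               # distinguish the two views
        loss = qe(**batch, labels=labels).loss
        loss.backward()
        optim.step()
        optim.zero_grad()
        return loss.item()

At inference time, the probability assigned to the "clean" label can serve as an unsupervised quality score, which matches the zero-shot usage mentioned in the abstract; with annotated data, the same encoder can be fine-tuned further on gold quality labels.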
Pages: 9