A benchmark dataset and evaluation methodology for Chinese zero pronoun translation

Cited by: 1
Authors
Xu, Mingzhou [1]
Wang, Longyue [2]
Liu, Siyou [3]
Wong, Derek F. [1]
Shi, Shuming [2]
Tu, Zhaopeng [2]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Taipa, Macao, Peoples R China
[2] Tencent, AI Lab, Shenzhen, Peoples R China
[3] Macao Polytech Inst, Sch Languages & Translat, Taipa, Macao, Peoples R China
Keywords
Zero pronoun; Machine translation; Benchmark dataset; Evaluation metric; Discourse
DOI
10.1007/s10579-023-09660-5
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The phenomenon of zero pronoun (ZP) has attracted increasing interest in the machine translation community due to its importance and difficulty. However, previous studies generally evaluate the quality of translating ZPs with BLEU score on MT testsets, which is not expressive or sensitive enough for accurate assessment. To bridge the data and evaluation gaps, we propose a benchmark testset and evaluation metric for targeted evaluation on Chinese ZP translation. The human-annotated testset covers five challenging genres, which reveal different characteristics of ZPs for comprehensive evaluation. We systematically revisit advanced models on ZP translation and identify current challenges for future exploration. We release data, code, and trained models, which we hope can significantly promote research in this field.
Pages: 1263-1293
Page count: 31