A benchmark dataset and evaluation methodology for Chinese zero pronoun translation

Cited by: 1
Authors
Xu, Mingzhou [1]
Wang, Longyue [2]
Liu, Siyou [3]
Wong, Derek F. [1]
Shi, Shuming [2]
Tu, Zhaopeng [2]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Taipa, Macao, Peoples R China
[2] Tencent, AI Lab, Shenzhen, Peoples R China
[3] Macao Polytech Inst, Sch Languages & Translat, Taipa, Macao, Peoples R China
Keywords
Zero pronoun; Machine translation; Benchmark dataset; Evaluation metric; Discourse
DOI
10.1007/s10579-023-09660-5
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Zero pronouns (ZPs) have attracted increasing interest in the machine translation (MT) community because they are both important and difficult to translate. However, previous studies generally evaluate ZP translation quality with BLEU on MT test sets, a metric that is not expressive or sensitive enough for accurate assessment. To bridge the data and evaluation gaps, we propose a benchmark test set and an evaluation metric for targeted evaluation of Chinese ZP translation. The human-annotated test set covers five challenging genres, which reveal different characteristics of ZPs for comprehensive evaluation. We systematically revisit advanced models on ZP translation and identify current challenges for future exploration. We release data, code, and trained models, which we hope will promote research in this field.
Pages: 1263-1293
Page count: 31
Published in
Language Resources and Evaluation, 2023, 57: 1263-1293