Integrating Pronunciation into Chinese-Vietnamese Statistical Machine Translation

Cited by: 0
Authors
Anh Tran Huu [1]
Heyan Huang [1]
Yuhang Guo [1]
Shumin Shi [1]
Ping Jian [1]
Affiliations
[1] Department of Computer Science and Technology, Beijing Institute of Technology
Funding
National Natural Science Foundation of China
Keywords
pronunciation integration; low-resource languages; Chinese-Vietnamese machine translation; Sino-Vietnamese words
DOI
Not available
Chinese Library Classification number
H085 [Machine translation]
Discipline classification code
050211
Abstract
Statistical machine translation for low-resource languages suffers from a lack of abundant training corpora. Several methods have been proposed to address this, such as using a pivot language as a bridge between the source and target languages. However, errors accumulate along such extended translation pipelines. In this paper, we propose an approach to low-resource language translation that exploits the pronunciation correlations between languages. We find that pronunciation features improve both Chinese-Vietnamese and Vietnamese-Chinese translation quality. Experimental results show that the proposed model yields effective improvements, with the BLEU (bilingual evaluation understudy) score increasing by up to 1.03.
Pages: 715-723
Page count: 9
Related papers
13 in total
  • [1] SRILM - an extensible language modeling toolkit. A. Stolcke. Proc. of the 7th Int. Conf. on Spoken Language Processing (ICSLP'02). 2002
  • [2] Improving pivot-based statistical machine translation using random walk. X. N. Zhu, Z. J. He, H. Wu, H. F. Wang, C. H. Zhu, T. J. Zhao. Proc. 2013 Conf. on Empirical Methods in Natural Language Processing. 2013
  • [3] Vietnamese to Chinese machine translation via Chinese character as pivot. H. Zhao, T. J. Yin, J. Y. Zhang. Proc. 27th Pacific Asia Conf. on Language, Information, and Computation (PACLIC 27). 2013
  • [4] A quantitative and typological approach to correlating linguistic complexity. Y. M. Oh, F. Pellegrino, E. Marsico, C. Coupé. Proc. 5th Conf. on Quantitative Investigations in Theoretical Linguistics. 2013
  • [5] Research on the effects of Sino-Vietnamese pronunciation in helping Vietnamese students study Mandarin Chinese. N. D. Vinh. 2015
  • [6] Factored translation models. P. Koehn, H. Hoang. Proc. 2007 Joint Conf. on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL). 2007
  • [7] CCG supertags in factored statistical machine translation. A. Birch, M. Osborne, P. Koehn. Proc. Second Workshop on Statistical Machine Translation (ACL). 2007
  • [8] A systematic comparison of various statistical alignment models. F. J. Och, H. Ney. Computational Linguistics, 29(1): 19-51. 2003
  • [9] BLEU: a method for automatic evaluation of machine translation. K. Papineni, S. Roukos, T. Ward, W.-J. Zhu. Proc. 40th Annual Meeting of the Association for Computational Linguistics. 2002
  • [10] Linguistically-augmented Bulgarian-to-English statistical machine translation model. R. Wang, P. Osenova, K. Simov. Proc. Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra). 2012