Unsupervised Sub-tree Alignment for Tree-to-Tree Translation

Cited by: 7
Authors
Xiao, Tong [1]
Zhu, Jingbo [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China
Funding
China Postdoctoral Science Foundation; National Science Foundation (USA)
DOI
10.1613/jair.4033
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically-motivated and does not rely on word alignments. As a by-product, our model outputs a sub-tree alignment matrix encoding a large number of diverse alignments between syntactic structures, from which machine translation systems can efficiently extract translation rules that are often filtered out due to the errors in 1-best alignment. Experimental results show that the proposed approach outperforms three state-of-the-art baseline approaches in both alignment accuracy and grammar quality. When applied to machine translation, our approach yields a +1.0 BLEU improvement and a -0.9 TER reduction on the NIST machine translation evaluation corpora. With tree binarization and fuzzy decoding, it even outperforms a state-of-the-art hierarchical phrase-based system.
Pages: 733-782
Page count: 50