Using syntax for improving phrase-based SMT in low-resource languages

Times cited: 2
Authors
Fadaei, Hakimeh [1]
Faili, Heshaam [1, 2]
Affiliations
[1] Univ Tehran, Sch Elect & Comp Engn, Coll Engn, Tehran, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
Funding
U.S. National Science Foundation;
Keywords
MODEL;
DOI
10.1093/llc/fqz033
Chinese Library Classification number
C [Social Sciences, General];
Discipline classification code
03 ; 0303 ;
Abstract
Data-driven approaches to machine translation, such as statistical and neural machine translation, suffer from data sparsity when dealing with low-resource languages. In these cases, other sources of information, including linguistic information, can alleviate the problem. In this article, we focus on the problem of word ordering in translation from a high-resource to a low-resource language and try to improve translation quality by using syntactic information from the high-resource side. We propose syntactic features based on Tree Adjoining Grammar (TAG) to be employed in a phrase-based SMT model in order to improve word ordering. A set of synchronous TAG rules is extracted and used to estimate the probability of the phrase orders suggested by the phrase-based model. The main idea is to handle word ordering by using TAG's extended domain of locality, which abstracts long-distance dependencies into a local view, namely a TAG elementary tree. Experiments on English-Persian and English-German translation show that combining the proposed TAG-based reordering features with lexical and hierarchical reordering models yields significant improvements over the baseline and in comparison with a neural reordering model and a pre-reordering model.
Pages: 507-528
Page count: 22