Incorporating Linguistic Information to Statistical Word-Level Alignment

Cited by: 0
Authors
Cendejas, Eduardo [1]
Barcelo, Grettel [1]
Gelbukh, Alexander [1]
Sidorov, Grigori [1]
Affiliation
[1] Natl Polytech Inst, Ctr Res Comp, Mexico City, DF, Mexico
Source
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, PROCEEDINGS | 2009 / Vol. 5856
Keywords
Parallel texts; word alignment; linguistic information; dictionary; cognates; semantic domains; morphological information;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Alignment algorithms enrich parallel texts by establishing a relationship between the structures of the languages involved. Depending on the alignment level, this enrichment can be performed on the paragraphs, sentences, or words of the source-language content and its translation. There are two main approaches to word-level alignment: statistical and linguistic. Because languages differ in their grammar rules, statistical algorithms usually give lower precision. For this reason, algorithms of this type are generally developed for a specific language pair using linguistic techniques. This paper presents a hybrid alignment system that combines the two traditional approaches. It provides user-friendly configuration and is adaptable to the computational environment. The system uses linguistic resources and procedures such as identification of cognates, morphological information, syntactic trees, dictionaries, and semantic domains. We show that the system outperforms existing algorithms.
Pages: 387-394
Page count: 8