Automatic construction of parallel English-Chinese corpus for cross-language information retrieval

Authors
Chen, J [1]
Nie, JY [1]
Affiliation
[1] Univ Montreal, Dept Informat & Rech Operat, Montreal, PQ H3C 3J7, Canada
Source
6TH APPLIED NATURAL LANGUAGE PROCESSING CONFERENCE/1ST MEETING OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE AND PROCEEDINGS OF THE ANLP-NAACL 2000 STUDENT RESEARCH WORKSHOP | 2000
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper we first describe a parallel text mining system that finds parallel texts automatically on the Web. The generated Chinese-English parallel corpus is used to train a probabilistic translation model which translates queries for Chinese-English cross-language information retrieval (CLIR). We will discuss some problems in translation model training and show the preliminary CLIR results.
Pages: 21-28
Number of pages: 8