Automatic construction of parallel English-Chinese corpus for cross-language information retrieval

Authors
Chen, J [1]
Nie, JY [1]
Affiliations
[1] Univ Montreal, Dept Informat & Rech Operat, Montreal, PQ H3C 3J7, Canada
Source
6TH APPLIED NATURAL LANGUAGE PROCESSING CONFERENCE/1ST MEETING OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE AND PROCEEDINGS OF THE ANLP-NAACL 2000 STUDENT RESEARCH WORKSHOP | 2000
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper we first describe a parallel text mining system that finds parallel texts automatically on the Web. The generated Chinese-English parallel corpus is used to train a probabilistic translation model which translates queries for Chinese-English cross-language information retrieval (CLIR). We will discuss some problems in translation model training and show the preliminary CLIR results.
Pages: 21-28
Number of pages: 8