Dictionary-based techniques for cross-language information retrieval

Cited by: 53
Authors
Levow, GA
Oard, DW
Resnik, P
Affiliations
[1] Univ Chicago, Dept Comp Sci, Chicago, IL 60637 USA
[2] Univ Maryland, Coll Informat Studies, College Pk, MD 20742 USA
[3] Univ Maryland, Inst Adv Comp Studies, College Pk, MD 20742 USA
[4] Univ Maryland, Dept Linguist, College Pk, MD 20742 USA
Funding
U.S. National Science Foundation
Keywords
cross-language information retrieval; ranked retrieval; dictionary-based translation;
DOI
10.1016/j.ipm.2004.06.012
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cross-language information retrieval (CLIR) systems allow users to find documents written in different languages from that of their query. Simple knowledge structures such as bilingual term lists have proven to be a remarkably useful basis for bridging that language gap. A broad array of dictionary-based techniques have demonstrated utility, but comparison across techniques has been difficult because evaluation results often span only a limited range of conditions. This article identifies the key issues in dictionary-based CLIR, develops unified frameworks for term selection and term translation that help to explain the relationships among existing techniques, and illustrates the effect of those techniques using four contrasting languages for systematic experiments with a uniform query translation architecture. Key results include identification of a previously unseen dependence of pre- and post-translation expansion on orthographic cognates and development of a query-specific measure for translation fanout that helps to explain the utility of structured query methods. © 2004 Elsevier Ltd. All rights reserved.
Pages: 523-547
Page count: 25