Cross-language speech retrieval: Establishing a baseline performance

Citations: 0
Authors
Sheridan, P
Wechsler, M
Schauble, P
Source
PROCEEDINGS OF THE 20TH ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 1997
DOI
10.1145/258525.258544
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We present here the realisation of a cross-language speech retrieval system which retrieves German speech documents in response to user queries specified as French text. This has been achieved through the integration of two existing modules of the SPIDER information retrieval system, namely the query pseudo-translation module and the speech retrieval module. Our approach to cross-language retrieval uses an automatically constructed corpus-based information structure called a similarity thesaurus. A similarity thesaurus can be constructed over any loosely comparable corpus - a parallel corpus is not necessary. The similarity thesaurus used here was constructed over a 330 MByte corpus of comparable German and French news stories. Our speech retrieval module is based on a speaker-independent phoneme recognizer and it indexes speech documents by N-grams of phonemic features. The speech retrieval module includes an additional probabilistic matching technique designed to aid retrieval from erroneous data such as the phonemic output of the speech recognition process. We have evaluated our cross-language speech retrieval system over a collection of 30 hours (3.4 GBytes) of German speech, comparing the effectiveness of French queries (cross-language) against performance on equivalent German queries (mono-lingual). It must be stressed that this work represents our first step in the direction of cross-language speech retrieval. Our aim here is to establish a baseline of performance on this task, against which we can then measure the success of our continuing research in this area.
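The abstract's idea of indexing speech documents by N-grams of phonemic features can be illustrated with a minimal sketch. This is not the SPIDER implementation: the phoneme symbols, the function names (phoneme_ngrams, build_index, search), the choice of N = 3, and the plain overlap-count scoring are all assumptions made for illustration, and the probabilistic matching technique mentioned in the abstract is not modelled here.

```python
from collections import Counter, defaultdict


def phoneme_ngrams(phonemes, n=3):
    """Return the overlapping n-grams of a phoneme sequence as tuples."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_index(docs, n=3):
    """Build an inverted index: n-gram -> Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, phonemes in docs.items():
        for gram in phoneme_ngrams(phonemes, n):
            index[gram][doc_id] += 1
    return index


def search(index, query_phonemes, n=3):
    """Rank documents by how many query n-gram occurrences they contain.

    Simple overlap counting is an illustrative stand-in, not the
    scoring function used in the paper.
    """
    scores = Counter()
    for gram in phoneme_ngrams(query_phonemes, n):
        for doc_id, count in index.get(gram, {}).items():
            scores[doc_id] += count
    return scores.most_common()


# Hypothetical recognizer output: one phoneme sequence per speech document.
docs = {
    "doc1": ["h", "a", "l", "o", "v", "E", "l", "t"],
    "doc2": ["g", "u", "t", "e", "n", "t", "a", "k"],
}
index = build_index(docs)
results = search(index, ["v", "E", "l", "t"])
print(results)
```

Because matching is done on short overlapping units rather than whole words, a document can still be retrieved when the recognizer gets some phonemes wrong, as long as enough N-grams survive intact.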
Pages: 99-108
Page count: 10