Cross-language speech retrieval: Establishing a baseline performance

Cited by: 0
Authors
Sheridan, P
Wechsler, M
Schauble, P
Source
PROCEEDINGS OF THE 20TH ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 1997
DOI
10.1145/258525.258544
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present here the realisation of a cross-language speech retrieval system which retrieves German speech documents in response to user queries specified as French text. This has been achieved through the integration of two existing modules of the SPIDER information retrieval system, namely the query pseudo-translation module and the speech retrieval module. Our approach to cross-language retrieval uses an automatically constructed corpus-based information structure called a similarity thesaurus. A similarity thesaurus can be constructed over any loosely comparable corpus - a parallel corpus is not necessary. The similarity thesaurus used here was constructed over a 330 MByte corpus of comparable German and French news stories. Our speech retrieval module is based on a speaker-independent phoneme recognizer and it indexes speech documents by N-grams of phonemic features. The speech retrieval module includes an additional probabilistic matching technique designed to aid retrieval from erroneous data such as the phonemic output of the speech recognition process. We have evaluated our cross-language speech retrieval system over a collection of 30 hours (3.4 GBytes) of German speech, comparing the effectiveness of French queries (cross-language) against performance on equivalent German queries (mono-lingual). It must be stressed that this work represents our first step in the direction of cross-language speech retrieval. Our aim here is to establish a baseline of performance on this task, against which we can then measure the success of our continuing research in this area.
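As a rough illustration of the N-gram indexing idea mentioned in the abstract, the following minimal Python sketch indexes phoneme sequences by trigrams and ranks documents by raw trigram overlap with a query. It is not the SPIDER implementation: the abstract speaks of N-grams of phonemic features and an additional probabilistic matching step, neither of which is modelled here. The function names (`ngrams`, `build_index`, `search`), the choice of n = 3, and the phoneme transcriptions are all hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(phonemes, n=3):
    """Return all contiguous n-grams of a phoneme sequence as tuples."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_index(docs, n=3):
    """Build an inverted index: n-gram -> Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, phonemes in docs.items():
        for gram in ngrams(phonemes, n):
            index[gram][doc_id] += 1
    return index


def search(index, query_phonemes, n=3):
    """Rank documents by how many query n-gram occurrences they contain.

    Plain overlap counting is an assumption made for this sketch; it stands
    in for the probabilistic matching described in the abstract.
    """
    scores = Counter()
    for gram in ngrams(query_phonemes, n):
        # .get avoids creating empty entries in the defaultdict for misses.
        for doc_id, count in index.get(gram, {}).items():
            scores[doc_id] += count
    return scores.most_common()


# Hypothetical phoneme transcriptions, made up for illustration only.
docs = {
    "doc_a": ["h", "a", "u", "s", "t", "y", "r"],
    "doc_b": ["m", "a", "u", "s", "f", "a", "l"],
}
index = build_index(docs)
results = search(index, ["h", "a", "u", "s"])
```

Because matching happens on short overlapping units rather than whole words, a document can still receive a score when the recognizer gets some phonemes wrong, which is the motivation the abstract gives for retrieval from erroneous data.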
Pages: 99-108
Page count: 10