Approaches to reduce the effects of OOV queries on indexed spoken audio

Cited by: 34
Authors
Logan, B [1]
Van Thong, JM [1]
Moreno, PJ [1]
Affiliation
[1] Hewlett Packard Labs, Cambridge, MA 02142 USA
Keywords
audio indexing; speech indexing; spoken document retrieval; out-of-vocabulary words
DOI
10.1109/TMM.2005.854429
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present several novel approaches to the Out of Vocabulary (OOV) query problem for spoken audio: indexing based on syllable-like units called particles and query expansion according to acoustic confusability for a word index. We also examine linear and OOV-based combination of indexing schemes. We experiment on 75 h of broadcast news, comparing our techniques to a word index, a phoneme index and a phoneme index queried with phoneme sequences. Our results show that our approaches are superior to both a word index and a phoneme index for OOV words, and have comparable performance to the sequence of phonemes scheme. The particle system has worse performance than the acoustic query expansion scheme. The best system uses word queries for in-vocabulary words and a linear combination of the phoneme sequence scheme and acoustic query expansion for OOV words. Using the best possible weights for linear combination, this system improves the average precision from 0.35 for a word index to 0.40, a result only obtainable if the weights could be learnt on a development query set. The next best system used a word index for in-vocabulary words and the phoneme sequence system otherwise and had average precision of 0.39.
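The OOV-based combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`linear_combine`, `search`), the representation of results as a document-to-score dictionary, the treatment of missing documents as score 0, and the default weight of 0.5 are all assumptions made for illustration. The three retrieval back ends are passed in as plain callables, since the abstract does not describe their interfaces.

```python
from typing import Callable, Dict, Set

# Hypothetical result type: document id -> retrieval score.
Scores = Dict[str, float]
Retriever = Callable[[str], Scores]


def linear_combine(a: Scores, b: Scores, weight: float) -> Scores:
    """Weighted sum of two score maps.

    Assumption: a document absent from one map contributes 0 from that map.
    """
    docs = set(a) | set(b)
    return {
        d: weight * a.get(d, 0.0) + (1.0 - weight) * b.get(d, 0.0)
        for d in docs
    }


def search(
    query_word: str,
    vocabulary: Set[str],
    word_index: Retriever,
    phoneme_sequence: Retriever,
    acoustic_expansion: Retriever,
    weight: float = 0.5,  # assumed value; the abstract says weights would be learnt
) -> Scores:
    """Route in-vocabulary queries to the word index; for OOV queries,
    linearly combine the phoneme-sequence and acoustic-expansion scores."""
    if query_word in vocabulary:
        return word_index(query_word)
    return linear_combine(
        phoneme_sequence(query_word), acoustic_expansion(query_word), weight
    )


if __name__ == "__main__":
    # Toy back ends with made-up scores, for demonstration only.
    vocab = {"news"}
    words = lambda q: {"doc1": 1.0}
    phones = lambda q: {"doc2": 0.5, "doc3": 0.25}
    expand = lambda q: {"doc3": 0.75}
    print(search("news", vocab, words, phones, expand))
    print(search("zyzzyva", vocab, words, phones, expand))
```

In practice the weight would be tuned on a development query set, which the abstract notes is a precondition for the reported best result.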
Pages: 899-906
Number of pages: 8