WORD-LATTICE BASED SPOKEN-DOCUMENT INDEXING WITH STANDARD TEXT INDEXERS

Cited by: 1
Authors
Seide, Frank [1]
Thambiratnam, Kit [1]
Yu, Roger Peng [1]
Affiliation
[1] Microsoft Research Asia, Beijing Sigma Center, Beijing 100080, People's Republic of China
Source
2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS | 2008
Keywords
Audio Indexing; Word Lattice; Posterior; Full-Text Indexing
DOI
10.1109/SLT.2008.4777898
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Indexing the spoken content of audio recordings requires automatic speech recognition, which remains unreliable. Unlike with text, a speech recognizer cannot tell us with certainty whether a word is present at a given point in the audio; it can only give a probability. Using these probabilities correctly significantly improves spoken-document search accuracy. In this paper, we first describe how to improve accuracy for "web-search style" (AND/phrase) queries into audio by using speech recognition alternates and word posterior probabilities derived from word lattices. We then present an end-to-end approach to doing so with standard text indexers, which by design cannot handle probabilities or unaligned alternates. The approach relies on a sequence of approximations that transform the numeric lattice-matching problem into a symbolic, text-based one that a commercial full-text indexer can implement. Experiments on a 170-hour lecture set show accuracy improvements of 30-60% for phrase searches and 130% for two-term AND queries, compared to indexing linear text.
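To make the role of word posteriors concrete, the following is a minimal illustrative sketch, not the method of the paper. It assumes a lattice has already been flattened into a list of word hypotheses, each with a posterior probability, and that hypotheses can be treated as independent (a simplifying assumption). The names `Hyp`, `expected_counts`, and `and_query_score` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Hyp(NamedTuple):
    """One word hypothesis taken from a lattice (hypothetical structure)."""
    word: str
    start: float      # start time in seconds
    end: float        # end time in seconds
    posterior: float  # word posterior probability in [0, 1]


def expected_counts(hyps: Iterable[Hyp]) -> Dict[str, float]:
    """Expected number of occurrences per word: the sum of its posteriors."""
    counts: Dict[str, float] = defaultdict(float)
    for h in hyps:
        counts[h.word] += h.posterior
    return dict(counts)


def and_query_score(hyps: List[Hyp], terms: List[str]) -> float:
    """Score a multi-term AND query against one document.

    For each term, estimate the probability that it occurs at least once,
    assuming independent hypotheses, then multiply across terms.
    """
    score = 1.0
    for term in terms:
        p_absent = 1.0
        for h in hyps:
            if h.word == term:
                p_absent *= 1.0 - h.posterior
        score *= 1.0 - p_absent
    return score


# Toy document: "lattice" and "latest" are competing alternates for the
# same stretch of audio; "index" is hypothesized twice.
hyps = [
    Hyp("lattice", 1.0, 1.5, 0.6),
    Hyp("latest", 1.0, 1.5, 0.3),
    Hyp("index", 2.0, 2.4, 0.5),
    Hyp("index", 7.0, 7.4, 0.5),
]
score = and_query_score(hyps, ["lattice", "index"])
```

A transcript-only ("linear text") index would keep just the single best word per position, so the alternate "latest" could never match a query. Keeping alternates with their posteriors lets such a document still be retrieved, only with a lower score.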
Pages: 293-296
Page count: 4