Lattice Indexing for Spoken Term Detection

Cited by: 114
Authors
Can, Dogan [1]
Saraclar, Murat [2]
Affiliations
[1] Univ So Calif, Los Angeles, CA 90089 USA
[2] Bogazici Univ, Dept Elect & Elect Engn, TR-34342 Istanbul, Turkey
Source
IEEE Transactions on Audio, Speech, and Language Processing | 2011, Vol. 19, No. 8
Keywords
Factor automata; lattice indexing; speech retrieval (SR); spoken term detection (STD); weighted finite-state transducers; finite-state transducers; language
DOI
10.1109/TASL.2011.2134087
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper considers the problem of constructing an efficient inverted index for the spoken term detection (STD) task. More specifically, we construct a deterministic weighted finite-state transducer storing soft-hits in the form of (utterance ID, start time, end time, posterior score) quadruplets. We propose a generalized factor transducer structure which retains the time information necessary for performing STD. The required information is embedded into the path weights of the factor transducer without disrupting the inherent optimality. We also describe how to index all substrings seen in a collection of raw automatic speech recognition lattices using the proposed structure. Our STD indexing/search implementation is built upon the OpenFst Library and is designed to scale well to large problems. Experiments on Turkish and English data sets corroborate our claims.
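The soft-hit quadruplets described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's weighted finite-state transducer construction and does not use OpenFst: it indexes every contiguous word sequence (factor) of a one-best transcript in a plain Python dictionary, whereas the paper indexes full lattices. The names `Hit`, `build_factor_index`, and `search` are hypothetical, and scoring a factor by the product of its word posteriors is an assumption made only for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One soft hit: (utterance ID, start time, end time, posterior score)."""
    utt_id: str
    start: float
    end: float
    score: float


def build_factor_index(utterances):
    """Index every contiguous word sequence (factor) of each utterance.

    `utterances` maps an utterance ID to a list of
    (word, start_time, end_time, posterior) tuples. This is a toy
    one-best input; the paper works on ASR lattices instead.
    """
    index = defaultdict(list)
    for utt_id, words in utterances.items():
        for i in range(len(words)):
            score = 1.0
            for j in range(i, len(words)):
                # Assumption for this sketch: factor score is the
                # product of the posteriors of its words.
                score *= words[j][3]
                factor = tuple(w[0] for w in words[i:j + 1])
                index[factor].append(
                    Hit(utt_id, words[i][1], words[j][2], score)
                )
    return index


def search(index, query):
    """Return the hits for a whitespace-separated query, best score first."""
    hits = index.get(tuple(query.split()), [])
    return sorted(hits, key=lambda h: h.score, reverse=True)


utterances = {
    "utt1": [
        ("spoken", 0.0, 0.4, 0.9),
        ("term", 0.4, 0.7, 0.8),
        ("detection", 0.7, 1.3, 0.5),
    ],
}
index = build_factor_index(utterances)
results = search(index, "term detection")
```

Here `results` holds a single hit for `utt1` spanning the start of "term" to the end of "detection". A dictionary of this kind grows quadratically with utterance length, which is the sort of cost a determinized factor transducer is meant to avoid.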
Pages: 2338-2347
Page count: 10