Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech

Cited by: 0
Authors
Weng, Chao [1]
Juang, Biing-Hwang [1]
Povey, Daniel
Affiliations
[1] Georgia Inst Technol, Ctr Signal & Image Proc, Atlanta, GA 30332 USA
Source
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012
Keywords
LVCSR; keyword spotting; DT; non-uniform criteria; WFST; optimization
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we investigate the feasibility of applying our prior work on discriminative training (DT) using non-uniform criteria to a keyword spotting task on spontaneous conversational speech. One DT method, minimum classification error (MCE), is recast and efficiently implemented in the weighted finite-state transducer (WFST) framework to fit a keyword spotting task. To validate our approach, we evaluate it on a conversational speech task, the credit card use subset of Switchboard, in two keyword spotting scenarios: one in which a large vocabulary continuous speech recognition (LVCSR) decoder is available, and one in which a simple word-loop grammar with a limited vocabulary is used. The results show that our approach performs well in both cases, achieving absolute figure-of-merit (FOM) improvements of 2.77% and 3.15% over the baseline, respectively.
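For orientation only, the MCE criterion named in the abstract can be sketched in its generic form from Juang and Katagiri (1992): a misclassification measure compares the reference class score with a soft maximum over competitors, and a sigmoid turns it into a smooth 0-1 loss. The abstract does not give the paper's WFST formulation or how the non-uniform criterion enters, so the `error_cost` multiplier below is a hypothetical stand-in, and the function name and the parameters `eta`, `gamma` and `theta` are illustrative assumptions rather than the authors' implementation.

```python
import math


def mce_loss(scores, correct, eta=1.0, gamma=1.0, theta=0.0, error_cost=1.0):
    """Smoothed MCE loss for one sample (generic form, not the paper's WFST version).

    scores     -- discriminant values g_j, one per class (e.g. log-likelihoods)
    correct    -- index of the reference class
    eta        -- sharpness of the soft maximum over competing classes
    gamma/theta -- slope and offset of the sigmoid smoothing
    error_cost -- hypothetical non-uniform weight for this sample's error
    """
    competitors = [g for j, g in enumerate(scores) if j != correct]
    if not competitors:
        raise ValueError("need at least one competing class")

    # Anti-discriminant: (1/eta) * log(mean(exp(eta * g_j))) over competitors,
    # computed with the max subtracted for numerical stability.
    m = max(eta * g for g in competitors)
    mean_exp = sum(math.exp(eta * g - m) for g in competitors) / len(competitors)
    anti = (m + math.log(mean_exp)) / eta

    # Misclassification measure: positive when competitors outscore the reference.
    d = anti - scores[correct]

    # Sigmoid-smoothed error count, scaled by the (assumed) error cost.
    return error_cost / (1.0 + math.exp(-(gamma * d + theta)))
```

With uniform costs (`error_cost=1.0`) this reduces to the standard smoothed error count; giving a larger cost to samples that contain keywords would be one simple way to make keyword errors count more, though the paper may weight errors differently.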
Pages: 558-561
Page count: 4