Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech

Cited by: 0

Authors:

Weng, Chao [1]

Juang, Biing-Hwang [1]

Povey, Daniel

Affiliations:

[1] Georgia Inst Technol, Ctr Signal & Image Proc, Atlanta, GA 30332 USA

Source:

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012

Keywords:

LVCSR; keyword spotting; DT; non-uniform criteria; WFST; OPTIMIZATION;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this work, we investigate the feasibility of applying our prior work on discriminative training (DT) using non-uniform criteria to a keyword spotting task on spontaneous conversational speech. One DT method, minimum classification error (MCE), is recast and efficiently implemented in the weighted finite state transducer (WFST) framework to fit a keyword spotting task. To validate our approach, we evaluate it on a conversational speech task, the credit card use subset of Switchboard, in two keyword spotting scenarios: one in which a large vocabulary continuous speech recognition (LVCSR) decoder is available, and one in which a simple word-loop grammar with a limited vocabulary is used. The results show that our approach performs well in both cases, achieving absolute figure-of-merit (FOM) improvements of 2.77% and 3.15% over the baseline, respectively.


Pages: 558-561

Number of pages: 4

References (16 in total):

[1] [Anonymous], IEEE

[2] Non-uniform error criteria for automatic pattern and speech recognition [J]. Fu, Qiang; Mansjur, Dwi Sianto; Juang, Biing-Hwang. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008: 1853-1856

[3] How may I help you? [J]. Gorin, AL; Parker, BA; Sachs, RM; Wilpon, JG. THIRD IEEE WORKSHOP ON INTERACTIVE VOICE TECHNOLOGY FOR TELECOMMUNICATIONS APPLICATIONS - IVTTA-96, PROCEEDINGS, 1996: 57-60

[4] Automatic recognition and understanding of spoken language - A first step toward natural human-machine communication [J]. Juang, BH; Furui, S. PROCEEDINGS OF THE IEEE, 2000, 88 (08): 1142-1165

[5] DISCRIMINATIVE LEARNING FOR MINIMUM ERROR CLASSIFICATION [J]. JUANG, BH; KATAGIRI, S. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1992, 40 (12): 3043-3054

[6] JUANG BH, 1998, P 16 ICA 135 M ASA, P617

[7] Weighted finite-state transducers in speech recognition [J]. Mohri, M; Pereira, F; Riley, M. COMPUTER SPEECH AND LANGUAGE, 2002, 16 (01): 69-88

[8] Povey D, 2012, INT CONF ACOUST SPEE, P4213, DOI 10.1109/ICASSP.2012.6288848

[9] Discriminative utterance verification for connected digits recognition [J]. Rahim, MG; Lee, CH; Juang, BH. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1997, 5 (03): 266-277

[10] KEYWORD DETECTION IN CONVERSATIONAL SPEECH UTTERANCES USING HIDDEN MARKOV MODEL-BASED CONTINUOUS SPEECH RECOGNITION [J]. ROSE, RC. COMPUTER SPEECH AND LANGUAGE, 1995, 9 (04): 309-333
