Dynamic Combination of Automatic Speech Recognition Systems by Driven Decoding

Cited by: 5

Authors:

Lecouteux, Benjamin [1]

Linares, Georges [2]

Esteve, Yannick [3]

Gravier, Guillaume [4]

Affiliations:

[1] LIG Univ Grenoble Alpes, GETALP Team, F-38041 Grenoble 9, France

[2] LIA Univ Avignon, Speech Proc Grp, F-84911 Avignon 9, France

[3] Univ Le Mans LIUM, Lab Informat, F-72085 Le Mans 9, France

[4] CNRS IRISA, F-35042 Rennes, France

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2013 / Vol. 21 / No. 06

Keywords:

Automatic speech recognition; speech processing; system combination;

DOI:

10.1109/TASL.2013.2248716

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Combining automatic speech recognition (ASR) systems generally relies on the posterior merging of the outputs or on acoustic cross-adaptation. In this paper, we propose an integrated approach where outputs of secondary systems are integrated in the search algorithm of a primary one. In this driven decoding algorithm (DDA), the secondary systems are viewed as observation sources that should be evaluated and combined with the others by a primary search algorithm. DDA is evaluated on a subset of the ESTER I corpus consisting of 4 hours of French radio broadcast news. Results demonstrate that DDA significantly outperforms vote-based approaches: we obtain a relative word error rate improvement of 14.5% over the best single system, as opposed to 6.7% with a ROVER combination. An in-depth analysis of DDA shows its ability to improve robustness (gains are greater in adverse conditions) and a relatively low dependency on the search algorithm. Applying DDA to both A* and beam-search-based decoders yields similar performance.

Citation

Pages: 1251-1260

Page count: 10

References (31 in total)

[11] Hillard D., 2007, P HLT
[12] Hoffmeister B., 2006, P ICSLP
[13] Huet S., 2010, COMPUT SPEECH LANG
[14] Lecouteux B, 2007, INT CONF ACOUST SPEE, P341
[15] Lecouteux B., 2009, P INT C SPEECH COMM
[16] Lecouteux, Benjamin; Linares, Georges; Esteve, Yannick; Gravier, Guillaume. Generalized driven decoding for speech recognition system combination [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008: 1549-+
[17] Lecouteux, Benjamin; Linares, Georges; Oger, Stanislas. Integrating imperfect transcripts into speech recognition systems for building high-quality corpora [J]. COMPUTER SPEECH AND LANGUAGE, 2012, 26 (02): 67-89
[18] Massonie D., 2005, P INTERSPEECH 05 LIS
[19] Mauclair J., 2006, P LREC 06 GEN IT MAY
[20] Nocera P., 2004, P SWIM LECT MAST SPE
