Uncertainty decoding for DNN-HMM hybrid systems based on numerical sampling

Cited by: 0

Authors:

Huemmer, Christian [1]

Maas, Roland [1]

Schwarz, Andreas [1]

Astudillo, Ramon Fernandez [2]

Kellermann, Walter [1]

Affiliations:

[1] Univ Erlangen Nurnberg, Multimedia Commun & Signal Proc, Erlangen, Germany

[2] INESC ID Lisboa, Spoken Language Syst Lab, Lisbon, Portugal

Source:

16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5 | 2015

Keywords:

robust speech recognition; observation uncertainty; numerical sampling; uncertainty decoding; deep neural networks; speech; adaptation

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

In this article, we propose an uncertainty decoding scheme for DNN-HMM hybrid systems based on numerical sampling. A finite set of samples is drawn from the estimated probability distribution of the acoustic features and subsequently passed through feature transformations/extensions and the deep neural network (DNN). Then, the nonlinearly-transformed feature samples are averaged at the output of the DNN in order to approximate the posterior distribution of the context-dependent Hidden Markov Model (HMM) states. This concept is experimentally verified for the REVERB challenge task using a reverberation-robust DNN-HMM hybrid system: The numerical sampling is performed in the logmelspec domain, where we estimate the posterior distribution of the acoustic features by combining coherence-based Wiener filtering and uncertainty propagation. The experimental results highlight the good performance of the proposed uncertainty decoding scheme with significantly increased recognition accuracy even for a small number of feature samples.
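
The sampling-and-averaging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a diagonal Gaussian feature posterior per frame (the abstract only says "estimated probability distribution"), uses a hypothetical one-hidden-layer network `toy_dnn` with random weights in place of the trained acoustic model, and leaves out the feature transformations/extensions applied before the DNN. All function and variable names are illustrative.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def toy_dnn(x, W1, b1, W2, b2):
    # Hypothetical stand-in for the acoustic model: ReLU layer + softmax
    h = np.maximum(0.0, x @ W1 + b1)
    return softmax(h @ W2 + b2)

def sampled_state_posteriors(mean, var, dnn, num_samples, rng):
    """Average DNN outputs over samples from N(mean, diag(var)).

    mean, var: arrays of shape (frames, feat_dim); Gaussian form is assumed
    returns:   array of shape (frames, num_states)
    """
    std = np.sqrt(var)
    # Draw samples with shape (num_samples, frames, feat_dim)
    noise = rng.standard_normal((num_samples,) + mean.shape)
    samples = mean + std * noise
    # Pass every sample through the network, then average at the output
    return dnn(samples).mean(axis=0)

# Toy dimensions: features, hidden units, HMM states, frames
rng = np.random.default_rng(0)
D, H, S, T = 4, 8, 3, 5
W1, b1 = rng.standard_normal((D, H)), np.zeros(H)
W2, b2 = rng.standard_normal((H, S)), np.zeros(S)
dnn = lambda x: toy_dnn(x, W1, b1, W2, b2)

mean = rng.standard_normal((T, D))   # point estimate of the features
var = np.full((T, D), 0.1)           # per-dimension uncertainty
post = sampled_state_posteriors(mean, var, dnn, num_samples=20, rng=rng)
```

Because each sampled output is a valid distribution over states, the average is one as well. With zero variance the sketch reduces to ordinary decoding on the point estimate, which is a convenient sanity check.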

Pages: 3556-3560

Page count: 5