Uncertainty decoding for DNN-HMM hybrid systems based on numerical sampling

Cited by: 0

Authors:

Huemmer, Christian [1]

Maas, Roland [1]

Schwarz, Andreas [1]

Astudillo, Ramon Fernandez [2]

Kellermann, Walter [1]

Affiliations:

[1] Univ Erlangen Nurnberg, Multimedia Commun & Signal Proc, Erlangen, Germany

[2] INESC ID Lisboa, Spoken Language Syst Lab, Lisbon, Portugal

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

robust speech recognition; observation uncertainty; numerical sampling; uncertainty decoding; deep neural networks; speech; adaptation

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this article, we propose an uncertainty decoding scheme for DNN-HMM hybrid systems based on numerical sampling. A finite set of samples is drawn from the estimated probability distribution of the acoustic features and subsequently passed through feature transformations/extensions and the deep neural network (DNN). Then, the nonlinearly-transformed feature samples are averaged at the output of the DNN in order to approximate the posterior distribution of the context-dependent Hidden Markov Model (HMM) states. This concept is experimentally verified for the REVERB challenge task using a reverberation-robust DNN-HMM hybrid system: The numerical sampling is performed in the logmelspec domain, where we estimate the posterior distribution of the acoustic features by combining coherence-based Wiener filtering and uncertainty propagation. The experimental results highlight the good performance of the proposed uncertainty decoding scheme with significantly increased recognition accuracy even for a small number of feature samples.
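The sampling-and-averaging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian feature posterior with diagonal variance per frame, uses a hypothetical two-layer network (`toy_dnn`) with random weights in place of the trained acoustic model, and omits the feature transformations/extensions mentioned above. All names and sizes are illustrative.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def toy_dnn(x, params):
    # Hypothetical stand-in for the acoustic model:
    # one ReLU hidden layer followed by a softmax over HMM states.
    w1, b1, w2, b2 = params
    h = np.maximum(0.0, x @ w1 + b1)
    return softmax(h @ w2 + b2)


def sampled_posteriors(mean, var, params, num_samples, rng):
    # mean, var: arrays of shape (frames, dims) describing an assumed
    # Gaussian feature posterior with diagonal variance for each frame.
    noise = rng.standard_normal((num_samples,) + mean.shape)
    samples = mean + np.sqrt(var) * noise      # (samples, frames, dims)
    posteriors = toy_dnn(samples, params)      # (samples, frames, states)
    # Average at the network output to approximate the state posterior.
    return posteriors.mean(axis=0)


rng = np.random.default_rng(0)
dims, hidden, states, frames = 24, 32, 10, 5   # illustrative sizes only
params = (
    0.1 * rng.standard_normal((dims, hidden)), np.zeros(hidden),
    0.1 * rng.standard_normal((hidden, states)), np.zeros(states),
)
mean = rng.standard_normal((frames, dims))     # assumed enhanced features
var = np.full((frames, dims), 0.5)             # assumed feature uncertainty

avg_post = sampled_posteriors(mean, var, params, num_samples=20, rng=rng)
point_post = toy_dnn(mean, params)             # decoding without uncertainty
```

With zero variance every sample equals the mean, so the averaged output reduces to the ordinary point-estimate posterior `point_post`. With nonzero variance the two generally differ, because the network is nonlinear.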


Pages: 3556-3560

Number of pages: 5
