DNN-based performance measures for predicting error rates in automatic speech recognition and optimizing hearing aid parameters

Cited by: 14

Authors:

Martinez, Angel Mario Castro ^{[1,2]}

Gerlach, Lukas ^{[2,3]}

Paya-Vaya, Guillermo ^{[2,3]}

Hermansky, Hynek ^{[4]}

Ooster, Jasper ^{[1,2]}

Meyer, Bernd T. ^{[1,2]}

Affiliations:

[1] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Akust, Oldenburg, Germany

[2] Exzellenzcluster Hearing4all, Oldenburg, Germany

[3] Leibniz Univ Hannover, Inst Microelect Syst, Hannover, Germany

[4] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD USA

Source:

SPEECH COMMUNICATION | 2019 / Vol. 106

Keywords:

Automatic speech recognition; Performance monitoring; Spatial filtering; Hearing aids;

DOI:

10.1016/j.specom.2018.11.006

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In several applications of machine listening, predicting how well an automatic speech recognition system will perform before the actual decoding enables the system to adapt to unseen acoustic characteristics dynamically. Feedback about speech quality, for instance, could allow modern hearing aids to select a speech source in complex acoustic scenes with the aim of enhancing the speech intelligibility of a target speaker. In this study, we look at different performance measures to estimate the word error rates of simulated behind-the-ear hearing aid signals and detect the azimuth angle of the target source in 180-degree spatial scenes. These measures derive from phoneme posterior probabilities produced by a deep neural network acoustic model. However, the more complex the model is, the more computationally expensive it becomes to obtain these measures; therefore, we assess how the model size affects prediction performance. Our findings suggest measures derived from smaller nets are suitable to predict error rates of more complex models reliably enough to be implemented in hearing aid hardware.


Pages: 44-56

Page count: 13

3 records in total

[1] Prediction of speech intelligibility with DNN-based performance measures
Martinez, Angel Mario Castro
Spille, Constantin
Rossbach, Jana
Kollmeier, Birger
Meyer, Bernd T.
COMPUTER SPEECH AND LANGUAGE, 2022, 74
[2] PREDICTING ERROR RATES FOR UNKNOWN DATA IN AUTOMATIC SPEECH RECOGNITION
Meyer, Bernd T.
Mallidi, Harish
Kayser, Hendrik
Hermansky, Hynek
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 5330-5334
[3] DNN-Based Speech Bandwidth Expansion and Its Application to Adding High-Frequency Missing Features for Automatic Speech Recognition of Narrowband Speech
Li, Kehuang
Huang, Zhen
Xu, Yong
Lee, Chin-Hui
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 2578-2582
