A model of speech recognition for hearing-impaired listeners based on deep learning

Cited by: 8
Authors
Rossbach, Jana [1]
Kollmeier, Birger [2]
Meyer, Bernd T. [1]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Commun Acoust & Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Med Phys & Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
Keywords
INTELLIGIBILITY INDEX; RECEPTION THRESHOLD; FLUCTUATING NOISE; PREDICTION; ENVELOPE; PERCEPTION; MODULATION; ALGORITHM; MASKING;
DOI
10.1121/10.0009411
CLC number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Automatic speech recognition (ASR) has made major progress based on deep machine learning, which has motivated the use of deep neural networks (DNNs) as perception models, specifically to predict human speech recognition (HSR). This study investigates whether a modeling approach based on a DNN that serves as a phoneme classifier [Spille, Ewert, Kollmeier, and Meyer (2018). Comput. Speech Lang. 48, 51-66] can predict HSR for subjects with different degrees of hearing loss when listening to speech embedded in different complex noises. The eight noise signals range from simple stationary noise to a single competing talker and are added to matrix sentences, which are presented to 20 hearing-impaired (HI) listeners (categorized into three groups with different types of age-related hearing loss) to measure their speech recognition threshold (SRT), i.e., the signal-to-noise ratio at which the word recognition rate is 50%. These SRTs are compared to responses obtained from the ASR-based model using degraded feature representations that take into account the individual hearing loss of the participants, as captured by a pure-tone audiogram. Additionally, SRTs obtained from eight normal-hearing (NH) listeners are analyzed. For the NH subjects and the three groups of HI listeners, the average SRT prediction error is below 2 dB, which is lower than the errors of the baseline models. (C) 2022 Author(s).
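The SRT defined in the abstract (the SNR at which word recognition reaches 50%) can be illustrated with a minimal sketch. This is not the paper's model; the measurement values, function name, and the use of simple linear interpolation below are hypothetical, chosen only to make the definition concrete:

```python
def estimate_srt(snrs, rates):
    """Return the SNR (in dB) at which the word recognition rate crosses 50%,
    by linear interpolation between adjacent measured points.
    Assumes rates increase monotonically with SNR (hypothetical simplification)."""
    for (x0, y0), (x1, y1) in zip(zip(snrs, rates), zip(snrs[1:], rates[1:])):
        if y0 < 0.5 <= y1:
            # Interpolate the 50% crossing between the two bracketing points
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("recognition rates never cross 50%")

# Hypothetical measurements: word recognition rate at several SNRs (dB)
snrs = [-12, -9, -6, -3, 0, 3]
rates = [0.05, 0.15, 0.40, 0.70, 0.90, 0.98]
srt = estimate_srt(snrs, rates)  # crossing lies between -6 and -3 dB
```

In practice, SRTs for matrix sentence tests are typically estimated with adaptive procedures or psychometric-function fits rather than raw interpolation; the sketch only captures the "SNR at 50% correct" definition used in the abstract.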
Pages: 1417-1427 (11 pages)
Related Papers
50 records in total
  • [11] Phoneme recognition in modulated maskers by normal-hearing and aided hearing-impaired listeners
    Phatak, Sandeep A.
    Grant, Ken W.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2012, 132 (03) : 1646 - 1654
  • [12] Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners
    Monaghan, Jessica J. M.
    Goehring, Tobias
    Yang, Xin
    Bolner, Federico
    Wang, Shangqiguo
    Wright, Matthew C. M.
    Bleeck, Stefan
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 141 (03) : 1985 - 1998
  • [13] Suprathreshold Auditory Processing and Speech Perception in Noise: Hearing-Impaired and Normal-Hearing Listeners
    Summers, Van
    Makashay, Matthew J.
    Theodoroff, Sarah M.
    Leek, Marjorie R.
    JOURNAL OF THE AMERICAN ACADEMY OF AUDIOLOGY, 2013, 24 (04) : 274 - 292
  • [14] Predicting speech intelligibility in hearing-impaired listeners using a physiologically inspired auditory model
    Zaar, Johannes
    Carney, Laurel H.
    HEARING RESEARCH, 2022, 426
  • [15] Comparing Binaural Pre-processing Strategies III: Speech Intelligibility of Normal-Hearing and Hearing-Impaired Listeners
    Voelker, Christoph
    Warzybok, Anna
    Ernst, Stephan M. A.
    TRENDS IN HEARING, 2015, 19
  • [16] Level variations in speech: Effect on masking release in hearing-impaired listeners
    Reed, Charlotte M.
    Desloge, Joseph G.
    Braida, Louis D.
    Perez, Zachary D.
    Leger, Agnes C.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2016, 140 (01) : 102 - 113
  • [17] Comparison of speech recognition performance with and without a face mask between a basic and a premium hearing aid in hearing-impaired listeners
    Seol, Hye Yoon
    Jo, Mini
    Yun, Heejung
    Park, Jin Gyun
    Byun, Hye Min
    Moon, Il Joon
    AMERICAN JOURNAL OF OTOLARYNGOLOGY, 2023, 44 (05)
  • [18] Automated Measurement of Speech Recognition, Reaction Time, and Speech Rate and Their Relation to Self-Reported Listening Effort for Normal-Hearing and Hearing-Impaired Listeners Using Various Maskers
    Holube, Inga
    Taesler, Stefan
    Ibelings, Saskia
    Hansen, Martin
    Ooster, Jasper
    TRENDS IN HEARING, 2024, 28
  • [19] Evaluating a 3-factor listener model for prediction of speech intelligibility to hearing-impaired listeners
    Huckvale, Mark
    Hilkhuysen, Gaston
    INTERSPEECH 2024, 2024, : 872 - 876
  • [20] Understanding Excessive SNR Loss in Hearing-Impaired Listeners
    Grant, Ken W.
    Walden, Therese C.
    JOURNAL OF THE AMERICAN ACADEMY OF AUDIOLOGY, 2013, 24 (04) : 258 - 273