A model of speech recognition for hearing-impaired listeners based on deep learning

Cited by: 8
Authors
Rossbach, Jana [1]
Kollmeier, Birger [2]
Meyer, Bernd T. [1]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Commun Acoust & Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Med Phys & Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
Keywords
INTELLIGIBILITY INDEX; RECEPTION THRESHOLD; FLUCTUATING NOISE; PREDICTION; ENVELOPE; PERCEPTION; MODULATION; ALGORITHM; MASKING;
DOI
10.1121/10.0009411
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Automatic speech recognition (ASR) has made major progress based on deep machine learning, which has motivated the use of deep neural networks (DNNs) as perception models, specifically to predict human speech recognition (HSR). This study investigates whether a modeling approach based on a DNN that serves as a phoneme classifier [Spille, Ewert, Kollmeier, and Meyer (2018). Comput. Speech Lang. 48, 51-66] can predict HSR for subjects with different degrees of hearing loss when listening to speech embedded in different complex noises. The eight noise signals range from simple stationary noise to a single competing talker and are added to matrix sentences, which are presented to 20 hearing-impaired (HI) listeners (categorized into three groups with different types of age-related hearing loss) to measure their speech recognition threshold (SRT), i.e., the signal-to-noise ratio at a 50% word recognition rate. These SRTs are compared to responses obtained from the ASR-based model using degraded feature representations that take into account the individual hearing loss of the participants, as captured by a pure-tone audiogram. Additionally, SRTs obtained from eight normal-hearing (NH) listeners are analyzed. For NH subjects and the three groups of HI listeners, the average SRT prediction error is below 2 dB, which is lower than the errors of the baseline models. (C) 2022 Author(s).
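As an illustration of the SRT definition used in the abstract (the signal-to-noise ratio at a 50% word recognition rate), the following is a minimal sketch of how an SRT could be read off a set of recognition rates measured at several SNRs. It is not the procedure used in the paper: the logistic psychometric function, the linear interpolation, and all names and numbers (`logistic`, `srt_from_points`, the SNR grid, the slope) are assumptions chosen for illustration only.

```python
import math


def logistic(snr_db, srt_db, slope_per_db):
    """Hypothetical psychometric function: word recognition rate (0..1) at an SNR.

    srt_db is the SNR giving a rate of 0.5; slope_per_db is the slope at that point.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


def srt_from_points(snrs_db, rates, target=0.5):
    """Linearly interpolate the SNR at which the recognition rate reaches `target`."""
    pairs = sorted(zip(snrs_db, rates))
    for (s0, r0), (s1, r1) in zip(pairs, pairs[1:]):
        if r0 <= target <= r1:
            if r1 == r0:
                return s0
            return s0 + (target - r0) * (s1 - s0) / (r1 - r0)
    raise ValueError("target rate is not bracketed by the measurements")


# Illustrative data only: rates generated from an assumed SRT of -7 dB SNR.
snrs = [-12.0, -9.0, -6.0, -3.0, 0.0]
rates = [logistic(s, srt_db=-7.0, slope_per_db=0.15) for s in snrs]
estimated_srt = srt_from_points(snrs, rates)
```

With a coarse SNR grid, linear interpolation recovers the assumed SRT only approximately; fitting the psychometric function to the data would be the more usual choice when enough measurements are available.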
Pages: 1417-1427
Page count: 11
Related papers
50 in total
  • [31] Spectrotemporal modulation sensitivity for hearing-impaired listeners: Dependence on carrier center frequency and the relationship to speech intelligibility
    Mehraei, Golbarg
    Gallun, Frederick J.
    Leek, Marjorie R.
    Bernstein, Joshua G. W.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2014, 136 (01) : 301 - 316
  • [32] Native and Non-native Speech Perception by Hearing-Impaired Listeners in Noise- and Speech Maskers
    Kilman, Lisa
    Zekveld, Adriana
    Hallgren, Mathias
    Ronnberg, Jerker
    TRENDS IN HEARING, 2015, 19
  • [33] Hearing-impaired listeners show increased audiovisual benefit when listening to speech in noise
    Puschmann, Sebastian
    Daeglau, Mareike
    Stropahl, Maren
    Mirkovic, Bojana
    Rosemann, Stephanie
    Thiel, Christiane M.
    Debener, Stefan
    NEUROIMAGE, 2019, 196 : 261 - 268
  • [34] Microscopic prediction of speech recognition for listeners with normal hearing in noise using an auditory model
    Juergens, Tim
    Brand, Thomas
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2009, 126 (05) : 2635 - 2648
  • [35] Modeling speech intelligibility in quiet and noise in listeners with normal and impaired hearing
    Rhebergen, Koenraad S.
    Lyzenga, Johannes
    Dreschler, Wouter A.
    Festen, Joost M.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2010, 127 (03) : 1570 - 1583
  • [36] Perceptual and Model-Based Evaluation of Ideal Time-Frequency Noise Reduction in Hearing-Impaired Listeners
    Koning, Raphael
    Bruce, Ian C.
    Denys, Sam
    Wouters, Jan
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2018, 26 (03) : 687 - 697
  • [37] A computational study of auditory models in music recognition tasks for normal-hearing and hearing-impaired listeners
    Friedrichs, Klaus
    Bauer, Nadja
    Martin, Rainer
    Weihs, Claus
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2017,
  • [38] Mandarin lexical tone recognition in sensorineural hearing-impaired listeners and cochlear implant users
    Wang, Shuo
    Liu, Bo
    Zhang, Hua
    Dong, Ruijuan
    Mannell, Robert
    Newall, Philip
    Chen, Xueqing
    Qi, Beier
    Zhang, Luo
    Han, Demin
    ACTA OTO-LARYNGOLOGICA, 2013, 133 (01) : 47 - 54
  • [39] Mechanisms of Spectrotemporal Modulation Detection for Normal- and Hearing-Impaired Listeners
    Ponsot, Emmanuel
    Varnet, Leo
    Wallaert, Nicolas
    Daoud, Elza
    Shamma, Shihab A.
    Lorenzi, Christian
    Neri, Peter
    TRENDS IN HEARING, 2021, 25
  • [40] Temporal Fine-Structure Coding and Lateralized Speech Perception in Normal-Hearing and Hearing-Impaired Listeners
    Locsei, Gusztav
    Pedersen, Julie H.
    Laugesen, Soren
    Santurette, Sebastien
    Dau, Torsten
    MacDonald, Ewen N.
    TRENDS IN HEARING, 2016, 20