A model of speech recognition for hearing-impaired listeners based on deep learning

Cited by: 8
Authors
Rossbach, Jana [1 ]
Kollmeier, Birger [2 ]
Meyer, Bernd T. [1 ]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Commun Acoust & Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Med Phys & Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany
Keywords
INTELLIGIBILITY INDEX; RECEPTION THRESHOLD; FLUCTUATING NOISE; PREDICTION; ENVELOPE; PERCEPTION; MODULATION; ALGORITHM; MASKING;
DOI
10.1121/10.0009411
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Automatic speech recognition (ASR) has made major progress based on deep machine learning, which has motivated the use of deep neural networks (DNNs) as perception models, specifically to predict human speech recognition (HSR). This study investigates whether a modeling approach based on a DNN that serves as a phoneme classifier [Spille, Ewert, Kollmeier, and Meyer (2018). Comput. Speech Lang. 48, 51-66] can predict HSR for subjects with different degrees of hearing loss when listening to speech embedded in different complex noises. The eight noise signals range from simple stationary noise to a single competing talker and are added to matrix sentences, which are presented to 20 hearing-impaired (HI) listeners (categorized into three groups with different types of age-related hearing loss) to measure their speech recognition threshold (SRT), i.e., the signal-to-noise ratio with 50% word recognition rate. These are compared to responses obtained from the ASR-based model using degraded feature representations that take into account the individual hearing loss of the participants captured by a pure-tone audiogram. Additionally, SRTs obtained from eight normal-hearing (NH) listeners are analyzed. For NH subjects and three groups of HI listeners, the average SRT prediction error is below 2 dB, which is lower than the errors of the baseline models. (C) 2022 Author(s).
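The abstract defines the SRT as the signal-to-noise ratio at which 50% of words are recognized, and reports the agreement between measured and predicted SRTs in dB. The following is a minimal sketch of those two quantities, not the authors' procedure: it assumes word recognition rates are available at a few fixed SNRs and reads off the 50% point by linear interpolation, and it assumes a mean absolute error as the error measure (the abstract does not state which measure was used). The function names and all numbers are hypothetical.

```python
from typing import Sequence


def estimate_srt(snrs_db: Sequence[float],
                 word_scores: Sequence[float],
                 target: float = 0.5) -> float:
    """Interpolate the SNR (dB) at which the word recognition rate reaches `target`.

    Assumption: scores are proportions in [0, 1] measured at fixed SNRs.
    """
    pairs = sorted(zip(snrs_db, word_scores))
    # Walk over neighbouring measurement points and find the pair that
    # brackets the target recognition rate.
    for (snr_lo, p_lo), (snr_hi, p_hi) in zip(pairs, pairs[1:]):
        if p_lo <= target <= p_hi and p_hi > p_lo:
            frac = (target - p_lo) / (p_hi - p_lo)
            return snr_lo + frac * (snr_hi - snr_lo)
    raise ValueError("target recognition rate is not bracketed by the data")


def mean_abs_error_db(measured: Sequence[float],
                      predicted: Sequence[float]) -> float:
    """Mean absolute difference between measured and predicted SRTs in dB."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical listener data: recognition rate at four SNRs.
srt = estimate_srt([-12.0, -9.0, -6.0, -3.0], [0.1, 0.3, 0.7, 0.9])

# Hypothetical measured vs. model-predicted SRTs for three noise conditions.
error = mean_abs_error_db([-7.5, -10.0, -4.0], [-6.5, -11.0, -5.5])
```

With these made-up values the 50% point falls halfway between -9 dB and -6 dB. A fitted psychometric function or an adaptive tracking procedure would be a common alternative to interpolation.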
Pages: 1417-1427
Page count: 11