Non-intrusive binaural speech recognition prediction for hearing aid processing

Cited by: 0
Authors
Rossbach, Jana [1]
Westhausen, Nils L. [1]
Kayser, Hendrik [2]
Meyer, Bernd T. [1]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Commun Acoust & Cluster Excellence Hearing4all, Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Auditory Signal Proc & Hearing Devices & Cluster E, Oldenburg, Germany
Keywords
Speech recognition prediction; Binaural; Non-intrusive; Deep neural network; INTELLIGIBILITY; CHALLENGE; NOISE;
DOI
10.1016/j.specom.2025.103202
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Hearing aids (HAs) often feature different signal processing algorithms to optimize speech recognition (SR) in a given acoustic environment. In this paper, we explore whether models that predict the SR performance of hearing-impaired (HI), aided users can be used to automatically select the best algorithm. To this end, SR experiments are conducted with 19 HI subjects who are aided with an open-source HA. Listeners' SR is measured in virtual, complex acoustic scenes with two distinct noise conditions using the different speech enhancement strategies implemented in this HA. For model-based selection, we apply a PHOneme-based Binaural Intelligibility model (PHOBI) that is based on our previous work and extended with a component for simulating hearing loss. The non-intrusive model uses a deep neural network to predict phone probabilities; the deterioration of these phone representations in the presence of noise, or of signal degradation in general, is quantified and used as the model output. The PHOBI model is trained with 960 h of English speech signals, a broad range of noise signals, and room impulse responses. The performance of model-based algorithm selection is measured with two metrics: (i) its ability to rank the HA algorithms in the order of the subjective SR results and (ii) the SR difference between the measured best algorithm and the model-based selection (ΔSR). Results are compared to selections obtained with one non-intrusive and two intrusive models. PHOBI outperforms the non-intrusive model and one of the intrusive models in both noise conditions, achieving significantly higher correlations (r = 0.63 and 0.80). ΔSR scores are significantly lower (better) than those of the non-intrusive baseline (3.5% and 4.6% against 8.6% and 9.8%, respectively). The ΔSR results of PHOBI and the intrusive models are not statistically different, although PHOBI operates on the observed signal alone and does not require a clean reference signal.
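The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration only: the algorithm names, the SR values, and the model scores are invented; a higher model score is assumed to mean better predicted SR; and Pearson's r is assumed as the correlation measure, since the abstract does not say which coefficient was used or how results were averaged across listeners.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def delta_sr(measured, predicted):
    """SR lost by trusting the model: measured SR of the truly best
    algorithm minus measured SR of the algorithm the model would pick."""
    best = max(measured, key=measured.get)      # best by listening test
    chosen = max(predicted, key=predicted.get)  # best by model score
    return measured[best] - measured[chosen]


# Hypothetical data: measured SR in percent words correct, and
# model scores where higher is assumed to mean better predicted SR.
measured = {"unprocessed": 42.0, "beamformer": 61.0, "noise_suppression": 55.0}
predicted = {"unprocessed": 0.30, "beamformer": 0.52, "noise_suppression": 0.58}

algorithms = sorted(measured)
r = pearson_r([measured[a] for a in algorithms],
              [predicted[a] for a in algorithms])
gap = delta_sr(measured, predicted)

print(f"correlation r = {r:.2f}")
print(f"delta SR = {gap:.1f} percentage points")  # 61.0 - 55.0 = 6.0
```

In this toy case the model ranks the algorithms roughly correctly but picks the second-best one, so ΔSR is the 6-point gap between the best and the selected algorithm. A ΔSR of 0 would mean the model selected the algorithm that listeners actually performed best with.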
Pages: 10