Non-intrusive binaural speech recognition prediction for hearing aid processing

Cited by: 0
Authors
Rossbach, Jana [1]
Westhausen, Nils L. [1]
Kayser, Hendrik [2]
Meyer, Bernd T. [1]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Commun Acoust & Cluster Excellence Hearing4all, Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Auditory Signal Proc & Hearing Devices & Cluster E, Oldenburg, Germany
Keywords
Speech recognition prediction; Binaural; Non-intrusive; Deep neural network; INTELLIGIBILITY; CHALLENGE; NOISE;
DOI
10.1016/j.specom.2025.103202
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Hearing aids (HAs) often feature different signal processing algorithms to optimize speech recognition (SR) in a given acoustic environment. In this paper, we explore whether models that predict the SR performance of hearing-impaired (HI), aided users can be used to automatically select the best algorithm. To this end, SR experiments are conducted with 19 HI subjects who are aided with an open-source HA. Listeners' SR is measured in virtual, complex acoustic scenes with two distinct noise conditions using the different speech enhancement strategies implemented in this HA. For model-based selection, we apply a PHOneme-based Binaural Intelligibility model (PHOBI) that is based on our previous work and extended with a component for simulating hearing loss. The non-intrusive model uses a deep neural network to predict phone probabilities; the deterioration of these phone representations in the presence of noise, or of signal degradation in general, is quantified and used as the model output. The PHOBI model is trained with 960 h of English speech signals, a broad range of noise signals, and room impulse responses. The performance of model-based algorithm selection is measured with two metrics: (i) its ability to rank the HA algorithms in the order of subjective SR results and (ii) the SR difference between the measured best algorithm and the model-based selection (ΔSR). Results are compared to selections obtained with one non-intrusive and two intrusive models. PHOBI outperforms the non-intrusive model and one of the intrusive models in both noise conditions, achieving significantly higher correlations (r = 0.63 and 0.80). ΔSR scores are significantly lower (better) than those of the non-intrusive baseline (3.5% and 4.6% against 8.6% and 9.8%, respectively). The ΔSR results of PHOBI and the intrusive models are not statistically different, although PHOBI operates on the observed signal alone and does not require a clean reference signal.
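The two selection metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the algorithm names and all numbers are hypothetical, the correlation is assumed to be a Pearson correlation between model scores and measured SR, and the model score is assumed to be oriented so that a higher value predicts better SR.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def delta_sr(measured_sr, model_score):
    """SR of the measured best algorithm minus SR of the model's choice.

    Both arguments map an algorithm name to a value. Assumption: a higher
    model score means a better predicted SR.
    """
    best_measured = max(measured_sr.values())
    selected = max(model_score, key=model_score.get)
    return best_measured - measured_sr[selected]


# Hypothetical example: measured SR in percent and model scores
# for three made-up enhancement algorithms.
measured = {"algo_a": 62.0, "algo_b": 70.0, "algo_c": 55.0}
predicted = {"algo_a": 0.58, "algo_b": 0.51, "algo_c": 0.40}

names = sorted(measured)
r = pearson_r([predicted[k] for k in names], [measured[k] for k in names])
gap = delta_sr(measured, predicted)
print(f"r = {r:.2f}, delta SR = {gap:.1f} percentage points")
```

In this made-up case the model picks `algo_a` while listeners performed best with `algo_b`, so ΔSR is the 8-point gap between the two. A ΔSR of 0 would mean the model selected the algorithm that was also best in the listening test.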
Pages: 10