SPEECH FOUNDATION MODELS ON INTELLIGIBILITY PREDICTION FOR HEARING-IMPAIRED LISTENERS

Citations: 0
Authors
Cuervo, Santiago [1]
Marxer, Ricard [1]
Affiliations
[1] Aix Marseille Univ, Univ Toulon, CNRS, LIS, Marseille, France
Keywords
Foundation models; speech perception; intelligibility prediction; hearing aids
DOI
10.1109/ICASSP48485.2024.10447907
Abstract
Speech foundation models (SFMs) have been benchmarked on many speech processing tasks, often achieving state-of-the-art performance with minimal adaptation. However, the SFM paradigm has been significantly less explored for applications of interest to the speech perception community. In this paper, we present a systematic evaluation of 10 SFMs on one such application: speech intelligibility prediction. We focus on the non-intrusive setup of the Clarity Prediction Challenge 2 (CPC2), where the task is to predict the percentage of words correctly perceived by hearing-impaired listeners from speech-in-noise recordings. To approach the problem, we propose a simple method that learns a lightweight, specialized prediction head on top of frozen SFMs. Our results reveal statistically significant differences in performance across SFMs. Our method produced the winning submission in CPC2, demonstrating its promise for speech perception applications.
Pages: 1421-1425
Number of pages: 5