Optimised spectral weightings for noise-dependent speech intelligibility enhancement

Cited by: 0
Authors
Tang, Yan [1]
Cooke, Martin [1]
Affiliations
[1] Univ Basque Country, Language & Speech Lab, Vitoria, Spain
Source
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012
Keywords
speech intelligibility; noise; optimisation; genetic algorithm; glimpse proportion
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural or synthetic speech is increasingly used in less-than-ideal listening conditions. Maximising the likelihood of correct message reception in such situations often leads to a strategy of loud and repetitive renditions of output speech. An alternative approach is to modify the speech signal in ways which increase intelligibility in noise without increasing signal level or duration. The current study focused on the design of stationary spectral modifications whose effect is to reallocate speech energy across frequency bands. Frequency band weights were selected using a genetic algorithm-based optimisation procedure, with glimpse proportion as the objective intelligibility metric, for a range of noise types and levels. As expected, a clear dependence of energy reallocation on noise type and global signal-to-noise ratio was found. One unanticipated outcome was the consistent discovery of sparse, highly selective spectral energy weightings, particularly in high noise conditions. In a subjective test using stationary noise and competing speech maskers, listeners were able to identify significantly more words in sentences as a result of spectral weighting, with increases of up to 15 percentage points. These findings suggest that context-dependent speech output can be used to maintain intelligibility at lower sound output levels.
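The idea of reallocating speech energy across frequency bands at a fixed overall level can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy three-band energies, the weight values, and the 3 dB local SNR threshold are all assumptions made for illustration, and the auditory front end normally used to compute glimpse proportion is omitted. Here, glimpse proportion is approximated as the fraction of band-by-frame cells in which speech exceeds noise by the threshold.

```python
import numpy as np


def reallocate_energy(band_energy, weights):
    """Scale each band by its weight, then renormalise so that the
    total energy equals that of the unmodified input."""
    band_energy = np.asarray(band_energy, dtype=float)  # shape: (bands, frames)
    weights = np.asarray(weights, dtype=float)          # shape: (bands,)
    weighted = band_energy * weights[:, None]
    scale = band_energy.sum() / weighted.sum()
    return weighted * scale


def glimpse_proportion(speech_energy, noise_energy, threshold_db=3.0):
    """Simplified proxy: fraction of band-frame cells whose local SNR
    exceeds threshold_db (the 3 dB default is an assumption)."""
    eps = 1e-12  # avoids log of zero for silent cells
    local_snr_db = 10.0 * np.log10((speech_energy + eps) / (noise_energy + eps))
    return float(np.mean(local_snr_db > threshold_db))


# Toy data (hypothetical): 3 bands x 2 frames, noise rising with frequency.
speech = np.ones((3, 2))
noise = np.array([[0.1, 0.1], [1.0, 1.0], [10.0, 10.0]])

# Hypothetical sparse weighting: abandon the heavily masked top band
# and move its energy to the band that is close to the threshold.
boosted = reallocate_energy(speech, [0.5, 2.5, 0.0])

gp_before = glimpse_proportion(speech, noise)
gp_after = glimpse_proportion(boosted, noise)
```

In this toy case the total speech energy is unchanged, yet more cells clear the threshold after weighting, because energy spent in a band that cannot be rescued is moved to one that can. A genetic algorithm, as described in the abstract, would search over the weight vector with a metric of this kind as its fitness function.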
Pages: 954-957
Number of pages: 4