Combining spectral and temporal modification techniques for speech intelligibility enhancement

Cited by: 6
Authors
Cooke, Martin [1 ,2 ]
Aubanel, Vincent [3 ]
Garcia Lecumberri, Maria Luisa [2 ]
Affiliations
[1] Ikerbasque Basque Sci Fdn, Bilbao, Spain
[2] Univ Basque Country, Language & Speech Lab, Vitoria 01006, Spain
[3] Univ Grenoble Alpes, Ctr Natl Rech Sci, GIPSA Lab, Grenoble, France
Keywords
Speech modification; Intelligibility; Retiming; Glimpsing; Cochlea-scaled entropy; Noise; Clear; Intensity
DOI
10.1016/j.csl.2018.10.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Modifying clean speech prior to output in noisy conditions can lead to substantial intelligibility gains. Most algorithms operate by redistributing energy across the signal, leaving the timing of the underlying speech sounds intact. Other techniques do alter the timing of speech relative to the masker. Both classes of approach - spectral and temporal - lead to a reduction in energetic masking. The current study examines how their combination affects intelligibility. Arguments can be made for both synergy and redundancy, and the presence of distortions introduced by both spectral and temporal approaches might even lead to an antagonistic combination. A cohort of native Spanish listeners identified keywords in sentences in unmodified form and following spectral, temporal and spectro-temporal modification, in the presence of a fluctuating masker. Errors in the spectro-temporal condition were substantially lower than following spectral or temporal modification alone, with a three-fold reduction compared to unmodified speech. Spectro-temporal gains were observed for all phonemes. A glimpse-based model of energetic masking incorporating speech rate changes predicts intelligibility (r = .96), and a glimpsing analysis provides further insights into the distinct mechanisms through which spectral and temporal approaches lead to a release from energetic masking. (C) 2018 Elsevier Ltd. All rights reserved.
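The idea behind a glimpse-based measure of energetic masking can be illustrated with a minimal sketch. It is not the model used in the study: the function name `glimpse_proportion`, the list-of-lists power representation, and the 3 dB local-SNR threshold are assumptions chosen for illustration, and the speech-rate component mentioned in the abstract is not modelled here.

```python
import math


def glimpse_proportion(speech_power, masker_power, threshold_db=3.0):
    """Return the fraction of time-frequency cells that count as glimpses.

    speech_power and masker_power are equally sized 2-D lists
    (frequency channel x time frame) of linear power values.
    A cell is a glimpse when the local speech-to-masker ratio
    exceeds threshold_db (an assumed value, not taken from the paper).
    """
    total = 0
    glimpsed = 0
    for speech_row, masker_row in zip(speech_power, masker_power):
        for s, m in zip(speech_row, masker_row):
            total += 1
            if s <= 0:
                continue  # no speech energy in this cell, so nothing to glimpse
            if m <= 0 or 10.0 * math.log10(s / m) > threshold_db:
                glimpsed += 1
    return glimpsed / total if total else 0.0


# Toy example: two channels by two frames.
speech = [[4.0, 1.0], [1.0, 10.0]]
masker = [[1.0, 1.0], [2.0, 1.0]]
proportion = glimpse_proportion(speech, masker)
```

In this toy setting, a spectral modification would change the values in `speech` within a frame, while a temporal modification would shift speech energy to frames where `masker` is weaker; both can raise the proportion of glimpsed cells.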
Pages: 26-39
Page count: 14
Related papers
50 in total
  • [21] The contribution of durational and spectral changes to the Lombard speech intelligibility benefit
    Cooke, Martin
    Mayo, Catherine
    Villegas, Julian
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2014, 135 (02) : 874 - 883
  • [22] A Simple Model of Speech Communication and its Application to Intelligibility Enhancement
    Kleijn, W. Bastiaan
    Hendriks, R. C.
    IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (03) : 303 - 307
  • [23] Speech intelligibility changes the temporal evolution of neural speech tracking
    Chen, Ya-Ping
    Schmidt, Fabian
    Keitel, Anne
    Roesch, Sebastian
    Hauswald, Anne
    Weisz, Nathan
    NEUROIMAGE, 2023, 268
  • [24] Improvement of speech intelligibility by reallocation of spectral energy
    Takou, Reiko
    Seiyama, Nobumasa
    Imai, Atsushi
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3572 - 3574
  • [25] Rephrasing-Based Speech Intelligibility Enhancement
    Zhang, Mengqiu
    Petkov, Petko N.
    Kleijn, W. Bastiaan
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3554 - 3558
  • [26] Speech intelligibility enhancement: a hybrid wiener approach
    Srinivasarao, V.
    Ghanekar, Umesh
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (03) : 517 - 525
  • [27] COMPARISON OF POST-FILTERING METHODS FOR INTELLIGIBILITY ENHANCEMENT OF TELEPHONE SPEECH
    Jokinen, Emma
    Alku, Paavo
    Vainio, Martti
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 2333 - 2337
  • [28] Combining the evidences of temporal and spectral enhancement techniques for improving the performance of Indian language identification system in the presence of background noise
    Polasi P.K.
    Sri Rama Krishna K.
    International Journal of Speech Technology, 2016, 19 (1) : 75 - 85
  • [29] Segment Specific Enhancement of Speech Characteristics for Improving Speech Intelligibility Under Adverse Listening Conditions
    Vanukuru, Praveen Kumar
    Jayan, A. R.
    2015 INTERNATIONAL CONFERENCE ON CONTROL COMMUNICATION & COMPUTING INDIA (ICCC), 2015, : 501 - 504
  • [30] Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems
    Kolbaek, Morten
    Tan, Zheng-Hua
    Jensen, Jesper
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (01) : 153 - 167