Increasing Speech Intelligibility via Spectral Shaping with Frequency Warping and Dynamic Range Compression plus Transient Enhancement

Cited by: 0
Authors
Godoy, Elizabeth [1]
Stylianou, Yannis [1]
Affiliations
[1] Fdn Res & Technol Hellas, Inst Comp Sci, Iraklion, Greece
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Keywords
speech intelligibility; spectral shaping; frequency warping; dynamic range compression; HARD-OF-HEARING; CLEAR; PERCEPTION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To make speech (natural or synthetic) more intelligible for listeners in real-world noisy environments, various modifications have been proposed that exploit spectral and temporal signal features. A previous evaluation campaign involving several approaches showed that a method combining Spectral Shaping (SS) and Dynamic Range Compression (DRC), referred to as SSDRC, was highly successful at increasing speech intelligibility. For the public follow-up campaign (the Hurricane Challenge), this work introduces three additional modifications into SSDRC in an attempt to further enhance intelligibility. First, to slow the articulation rate, the speech is uniformly time-stretched, which effectively increases signal redundancy. Second, a frequency warping mechanism that expands the vowel space is incorporated into the SS. Third, scaling that enhances the transient regions of speech is applied in the time domain along with DRC. Objective evaluations and extensive subjective evaluations (the Hurricane Challenge) show that the new approach achieves intelligibility gains over natural speech for all of the noise conditions evaluated, although, compared to SSDRC, less advantage is observed at higher SNR.
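As a rough illustration of the dynamic range compression idea mentioned in the abstract, the Python sketch below applies a generic feed-forward compressor to a list of samples. It is a textbook-style example and not the SSDRC algorithm from the paper: the function name `compress_dynamic_range`, the threshold, the ratio, and the smoothing coefficients are all assumptions chosen for illustration only.

```python
def compress_dynamic_range(samples, threshold=0.25, ratio=4.0,
                           attack=0.5, release=0.01):
    """Generic feed-forward compressor (illustrative; NOT the paper's method).

    samples         : list of floats, nominally in [-1, 1]
    threshold       : envelope level above which gain reduction starts (assumed value)
    ratio           : compression ratio applied above the threshold (assumed value)
    attack, release : one-pole smoothing coefficients in (0, 1] (assumed values)
    """
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Follow the envelope quickly when the level rises, slowly when it falls.
        coeff = attack if level > envelope else release
        envelope += coeff * (level - envelope)
        if envelope > threshold:
            # Above the threshold, the excess level (in dB) is divided by `ratio`.
            target = threshold * (envelope / threshold) ** (1.0 / ratio)
            gain = target / envelope
        else:
            # Below the threshold the signal passes through unchanged.
            gain = 1.0
        out.append(x * gain)
    return out


if __name__ == "__main__":
    # A quiet segment followed by a loud one: only the loud part is attenuated.
    signal = [0.1, -0.1, 0.1, 0.9, -0.9, 0.9]
    print(compress_dynamic_range(signal))
```

Reducing the level of loud segments relative to quiet ones leaves headroom to raise the overall level, which is one general motivation for using compression on speech presented in noise; how the paper combines this with spectral shaping and transient scaling is not specified in the abstract.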
Pages: 3539-3543
Number of pages: 5