Preservation of Speech Spectral Dynamics Enhances Intelligibility

Cited by: 0
Authors
Petkov, Petko N. [1]
Kleijn, W. Bastiaan [1, 2]
Affiliations
[1] KTH Royal Inst Technol, Sch Elect Engn, Stockholm, Sweden
[2] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington, New Zealand
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Keywords
speech intelligibility; spectral dynamics
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a method for enhancing intelligibility when speech is rendered in a noisy environment. The method rests on the hypothesis that intelligibility is a monotonic function of the degree to which the speech spectral dynamics are preserved. The accuracy of the spectral dynamics can then be traded against the power of the rendered speech signal: we can either maximize the dynamics accuracy for a given signal power, or minimize the signal power for a given dynamics accuracy. In our implementation, the spectral dynamics are quantified as the difference of the mel cepstra between time frames of the speech signal. We compared speech rendered by our implementation against both natural speech and a reference method, for the scenario where signal power is minimized given a target dynamics accuracy, and observed significantly improved intelligibility. The low system delay, low complexity, and low memory requirements make the new method particularly suitable for real-time applications.
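As an illustration of the quantity the abstract describes, the following is a minimal sketch of frame-to-frame differences of cepstral vectors. It is not the authors' implementation: the function names `spectral_dynamics` and `dynamics_distortion`, the use of adjacent-frame differences, and the mean-squared-error measure of "dynamics accuracy" are all assumptions made for this example, and the input is assumed to be mel cepstra that have already been computed elsewhere.

```python
from typing import List, Sequence

Frame = Sequence[float]


def spectral_dynamics(cepstra: Sequence[Frame]) -> List[List[float]]:
    """Difference between consecutive cepstral frames: d[t] = c[t+1] - c[t].

    Assumption: adjacent-frame differences; the abstract only says
    "difference of the mel cepstra between time frames".
    """
    return [
        [b - a for a, b in zip(prev, curr)]
        for prev, curr in zip(cepstra, cepstra[1:])
    ]


def dynamics_distortion(reference: Sequence[Frame],
                        rendered: Sequence[Frame]) -> float:
    """Mean squared error between the dynamics of two cepstral sequences.

    Hypothetical accuracy measure chosen for illustration only.
    """
    d_ref = spectral_dynamics(reference)
    d_ren = spectral_dynamics(rendered)
    if not d_ref or len(d_ref) != len(d_ren):
        raise ValueError("need two sequences of equal length (>= 2 frames)")
    total = sum(
        (x - y) ** 2
        for f_ref, f_ren in zip(d_ref, d_ren)
        for x, y in zip(f_ref, f_ren)
    )
    count = sum(len(f) for f in d_ref)
    return total / count


# Toy cepstra: 3 frames, 2 coefficients each (made-up numbers).
reference = [[1.0, 0.5], [1.5, 0.25], [1.0, 0.75]]
# Adding the same offset to every frame leaves the differences unchanged.
shifted = [[c + 2.0 for c in frame] for frame in reference]
# Scaling every frame by 0.5 flattens the dynamics.
flattened = [[0.5 * c for c in frame] for frame in reference]

print(dynamics_distortion(reference, shifted))    # 0.0
print(dynamics_distortion(reference, flattened))  # 0.05078125
```

Under these assumptions, a modification that adds the same offset to every cepstral frame keeps the dynamics intact, while one that compresses frame-to-frame variation is penalized. A measure of this kind could play the role of the "dynamics accuracy" that the abstract trades against signal power.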
Pages: 3564-3568
Number of pages: 5