Preservation of Speech Spectral Dynamics Enhances Intelligibility

Cited by: 0
Authors
Petkov, Petko N. [1]
Kleijn, W. Bastiaan [1, 2]
Affiliations
[1] KTH Royal Inst Technol, Sch Elect Engn, Stockholm, Sweden
[2] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington, New Zealand
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Keywords
speech intelligibility; spectral dynamics
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a method for the enhancement of intelligibility in scenarios where speech is rendered in a noisy environment. The method is based on the hypothesis that intelligibility is a monotonic function of the degree of preservation of the speech spectral dynamics. The accuracy of the speech spectral dynamics can then be traded against the power of the rendered speech signal. We can either maximize the dynamics accuracy given the signal power, or minimize the signal power given the dynamics accuracy. In our implementation, the spectral dynamics is quantified as the difference of the mel cepstra between time frames of the speech signal. We compared the speech rendered by our implementation against both natural speech and a reference method, for the scenario where signal power is minimized given a target dynamics accuracy, and observed a significantly improved intelligibility. The low system delay, and the low complexity and memory requirements make the new method particularly suitable for real-time applications.
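The abstract states only that spectral dynamics is quantified as the difference of the mel cepstra between time frames. The following is a minimal sketch of that quantity, not the authors' implementation. The frame length, hop size, filterbank size, number of cepstral coefficients, and the squared-error measure in the hypothetical `dynamics_error` function are all assumptions made for illustration.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_cepstra(signal, sample_rate, frame_len=400, hop=160,
                n_filters=24, n_ceps=13):
    """Return an array of shape (n_frames, n_ceps) of mel cepstra."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[t * hop:t * hop + frame_len] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    log_mel = np.log(power @ bank.T + 1e-10)  # small floor avoids log(0)
    # Unnormalized DCT-II of the log mel energies gives the cepstra.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_mel @ dct.T

def spectral_dynamics(ceps):
    """Difference of the mel cepstra between consecutive time frames."""
    return np.diff(ceps, axis=0)

def dynamics_error(reference, rendered, sample_rate):
    """Assumed measure: mean squared deviation of the rendered signal's
    dynamics from the reference dynamics (lower means better preserved)."""
    d_ref = spectral_dynamics(mel_cepstra(reference, sample_rate))
    d_ren = spectral_dynamics(mel_cepstra(rendered, sample_rate))
    return float(np.mean(np.sum((d_ref - d_ren) ** 2, axis=1)))
```

Under these assumptions, `dynamics_error(clean, clean + noise, 16000)` gives one number that could be traded against signal power in the way the abstract describes; the paper's actual accuracy criterion may differ.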
Pages: 3564-3568
Page count: 5
Related papers
50 in total
  • [1] Improvement of speech intelligibility by reallocation of spectral energy
    Takou, Reiko
    Seiyama, Nobumasa
    Imai, Atsushi
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3572 - 3574
  • [2] Effect of spectral degradation on speech intelligibility and cortical representation
    Choi, Hyo Jung
    Kyong, Jeong-Sug
    Won, Jong Ho
    Shim, Hyun Joon
    FRONTIERS IN NEUROSCIENCE, 2024, 18
  • [3] Effect of spectral resolution on the intelligibility of ideal binary masked speech
    Li, Ning
    Loizou, Philipos C.
    Journal of the Acoustical Society of America, 2008, 123 (04):
  • [4] Causal cortical dynamics of a predictive enhancement of speech intelligibility
    Di Liberto, Giovanni M.
    Lalor, Edmund C.
    Millman, Rebecca E.
    NEUROIMAGE, 2018, 166 : 247 - 258
  • [5] Spectral and temporal manipulations of SFF envelopes for enhancement of speech intelligibility in noise
    Chennupati, Nivedita
    Kadiri, Sudarsana Reddy
    Yegnanarayana, B.
    COMPUTER SPEECH AND LANGUAGE, 2019, 54 : 86 - 105
  • [6] Optimised spectral weightings for noise-dependent speech intelligibility enhancement
    Tang, Yan
    Cooke, Martin
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 954 - 957
  • [7] Effects on speech intelligibility of temporal jittering and spectral smearing of the high-frequency components of speech
    MacDonald, Ewen N.
    Pichora-Fuller, M. Kathleen
    Schneider, Bruce A.
    HEARING RESEARCH, 2010, 261 (1-2) : 63 - 66
  • [8] Effects of spectral and temporal modulation degradation on intelligibility and cortical tracking of speech signals
    De Palma, Ignacio Calderon
    Lopez, Laura S.
    Valdes, Alejandro Lopez
    INTERSPEECH 2023, 2023, : 5192 - 5196
  • [9] The intelligibility of pointillistic speech
    Kidd, Gerald
    Streeter, Timothy M.
    Ihlefeld, Antje
    Maddox, Ross K.
    Mason, Christine R.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2009, 126 (06) : EL196 - EL201
  • [10] Gender difference in speech intelligibility using speech intelligibility tests and acoustic analyses
    Kwon, Ho-Beom
    JOURNAL OF ADVANCED PROSTHODONTICS, 2010, 2 (03) : 71 - 76