Analysis of speech-based speech transmission index methods with implications for nonlinear operations

Cited by: 160
Authors
Goldsworthy, RL
Greenberg, JE
Affiliations
[1] MIT, Elect Res Lab, Cambridge, MA 02139 USA
[2] Harvard MIT Div Hlth Sci & Technol, Cambridge, MA 02139 USA
DOI
10.1121/1.1804628
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech. (C) 2004 Acoustical Society of America.
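For orientation, the final stage of a traditional (noise-probe) STI calculation can be sketched as follows. This is a minimal illustration of the widely used textbook mapping from modulation transfer values to an index, not a reproduction of any of the eight speech-based methods studied in the paper. The function names, the equal band weights, and the 7-band by 14-modulation-frequency layout are assumptions made for the example.

```python
import math

def noise_modulation_index(snr_db):
    """Modulation reduction caused by stationary additive noise at a given SNR."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))

def apparent_snr_db(m, eps=1e-6):
    """Apparent SNR (dB) implied by a modulation transfer value m in (0, 1)."""
    m = min(max(m, eps), 1.0 - eps)  # keep the logarithm finite
    return 10.0 * math.log10(m / (1.0 - m))

def transmission_index(snr_db, floor_db=-15.0, ceil_db=15.0):
    """Clip the apparent SNR to [-15, 15] dB and map it linearly to [0, 1]."""
    clipped = min(max(snr_db, floor_db), ceil_db)
    return (clipped - floor_db) / (ceil_db - floor_db)

def sti_from_modulation(m_by_band, band_weights):
    """Weighted average over bands of the mean transmission index per band.

    m_by_band: one list of modulation transfer values per frequency band.
    band_weights: hypothetical weights; standardized STI uses specific values.
    """
    if len(m_by_band) != len(band_weights):
        raise ValueError("need exactly one weight per band")
    total = sum(band_weights)
    sti = 0.0
    for m_values, weight in zip(m_by_band, band_weights):
        tis = [transmission_index(apparent_snr_db(m)) for m in m_values]
        sti += (weight / total) * (sum(tis) / len(tis))
    return sti

# Assumed layout: 7 bands x 14 modulation frequencies, noise at 0 dB SNR.
m_values = [[noise_modulation_index(0.0)] * 14 for _ in range(7)]
print(sti_from_modulation(m_values, [1.0] * 7))
```

For stationary noise alone, the apparent SNR recovered from the modulation index equals the true SNR, so the index saturates at 1 above +15 dB and at 0 below -15 dB. Nonlinear operations such as envelope thresholding or spectral subtraction break this simple relationship, which is the situation the abstract addresses.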
Pages: 3679 - 3689
Page count: 11
Related papers
35 in total
  • [1] IMAGE METHOD FOR EFFICIENTLY SIMULATING SMALL-ROOM ACOUSTICS
    ALLEN, JB
    BERKLEY, DA
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 (04) : 943 - 950
  • [2] The normalized correlation: Accounting for binaural detection across center frequency
    Bernstein, LR
    Trahiotis, C
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (06) : 3774 - 3784
  • [3] Spectro-temporal modulation transfer functions and speech intelligibility
    Chi, TS
    Gao, YJ
    Guyton, MC
    Ru, PW
    Shamma, S
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1999, 106 (05) : 2719 - 2732
  • [4] TEMPORAL ENVELOPE AND FINE-STRUCTURE CUES FOR SPEECH-INTELLIGIBILITY
    DRULLMAN, R
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1995, 97 (01) : 585 - 592
  • [5] EFFECT OF REDUCING SLOW TEMPORAL MODULATIONS ON SPEECH RECEPTION
    DRULLMAN, R
    FESTEN, JM
    PLOMP, R
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1994, 95 (05) : 2670 - 2680
  • [6] EFFECT OF TEMPORAL ENVELOPE SMEARING ON SPEECH RECEPTION
    DRULLMAN, R
    FESTEN, JM
    PLOMP, R
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1994, 95 (02) : 1053 - 1064
  • [7] A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
    Elhilali, M
    Chi, T
    Shamma, SA
    [J]. SPEECH COMMUNICATION, 2003, 41 (2-3) : 331 - 348
  • [8] FACTORS GOVERNING THE INTELLIGIBILITY OF SPEECH SOUNDS
    FRENCH, NR
    STEINBERG, JC
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1947, 19 (01) : 90 - 119
  • [9] Postprocessing method for suppressing musical noise generated by spectral subtraction
    Goh, Z
    Tan, KC
    Tan, BTG
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (03) : 287 - 292
  • [10] EFFECTS OF NOISE AND NOISE SUPPRESSION ON SPEECH-PERCEPTION BY COCHLEAR IMPLANT USERS
    HOCHBERG, I
    BOOTHROYD, A
    WEISS, M
    HELLMAN, S
    [J]. EAR AND HEARING, 1992, 13 (04) : 263 - 271