Analysis of speech-based speech transmission index methods with implications for nonlinear operations

Cited by: 163
Authors
Goldsworthy, RL
Greenberg, JE
Affiliations
[1] MIT, Elect Res Lab, Cambridge, MA 02139 USA
[2] Harvard MIT Div Hlth Sci & Technol, Cambridge, MA 02139 USA
DOI
10.1121/1.1804628
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech. © 2004 Acoustical Society of America.
Pages: 3679-3689
Number of pages: 11