Comparison of a short-time speech-based intelligibility metric to the speech transmission index and intelligibility data

Cited by: 0
Authors
[1] Payton, Karen L.
[2] Shrestha, Mona
Source
Payton, K.L. (kpayton@umassd.edu) | Year: 1600 / Volume: Acoustical Society of America / Issue: 134
Abstract
Several algorithms have been shown to generate a metric corresponding to the Speech Transmission Index (STI) using speech as a probe stimulus [e.g., Goldsworthy and Greenberg, J. Acoust. Soc. Am. 116, 3679-3689 (2004)]. The time-domain approaches work well on long speech segments and have the added potential to be used for short-time analysis. This study investigates the performance of the Envelope Regression (ER) time-domain STI method as a function of window length, in acoustically degraded environments with multiple talkers and speaking styles. The ER method is compared with a short-time Theoretical STI, derived from octave-band signal-to-noise ratios and reverberation times. For windows as short as 0.3 s, the ER method tracks short-time Theoretical STI changes in stationary speech-shaped noise, fluctuating restaurant babble, and stationary noise plus reverberation. The metric is also compared to intelligibility scores on conversational speech and speech articulated clearly but at normal speaking rates (Clear/Norm) in stationary noise. Correlation between the metric and intelligibility scores is high and, consistent with the subject scores, the metrics are higher for Clear/Norm speech than for conversational speech and higher for the first word in a sentence than for the last word. © 2013 Acoustical Society of America.
DOI
Not available
Document type
Conference article (CA)