Several algorithms have been shown to generate a metric corresponding to the Speech Transmission Index (STI) using speech as a probe stimulus [e.g., Goldsworthy and Greenberg, J. Acoust. Soc. Am. 116, 3679-3689 (2004)]. The time-domain approaches work well on long speech segments and have the added potential to be used for short-time analysis. This study investigates the performance of the Envelope Regression (ER) time-domain STI method as a function of window length,
in acoustically degraded environments with multiple talkers and speaking styles. The ER method is compared with a short-time Theoretical STI, derived from octave-band signal-to-noise ratios and reverberation times. For windows as short as 0.3 s, the ER method tracks short-time Theoretical STI changes in stationary speech-shaped noise, fluctuating restaurant babble, and stationary noise plus reverberation. The metric is also compared to intelligibility scores on conversational speech and speech articulated clearly but at normal speaking rates (Clear/Norm) in stationary noise. Correlation between the metric and intelligibility scores is high and, consistent with the subject scores, the metrics are higher for Clear/Norm speech than for conversational speech and higher for the first word in a sentence than for the last word. © 2013 Acoustical Society of America.
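
As an illustration of how an STI value can be derived from octave-band signal-to-noise ratios and reverberation times, the following is a minimal sketch of the classic modulation-transfer-function formulation (reverberation and stationary noise terms, apparent SNR clipped to ±15 dB, averaged over 14 modulation frequencies). It is not necessarily the exact computation used in this study. The function names are hypothetical, and the equal octave-band weights are a simplifying assumption; standardized STI variants use unequal band weights and further corrections. For a short-time version, the per-band SNRs would be estimated separately for each analysis window.

```python
import math

# The 14 modulation frequencies (Hz) of the classic STI scheme, 0.63 to 12.5 Hz.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_transfer(f_mod_hz, snr_db, rt60_s):
    """Modulation reduction factor from reverberation and stationary noise."""
    reverb_term = 1.0 / math.sqrt(
        1.0 + (2.0 * math.pi * f_mod_hz * rt60_s / 13.8) ** 2)
    noise_term = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb_term * noise_term


def band_transmission_index(snr_db, rt60_s):
    """Transmission index for one octave band, averaged over modulation rates."""
    total = 0.0
    for f_mod in MOD_FREQS_HZ:
        m = modulation_transfer(f_mod, snr_db, rt60_s)
        # Keep m strictly inside (0, 1) so the log ratio below is defined.
        m = min(max(m, 1e-9), 1.0 - 1e-9)
        apparent_snr_db = 10.0 * math.log10(m / (1.0 - m))
        # Clip the apparent SNR to the +/-15 dB range, then map to [0, 1].
        apparent_snr_db = min(max(apparent_snr_db, -15.0), 15.0)
        total += (apparent_snr_db + 15.0) / 30.0
    return total / len(MOD_FREQS_HZ)


def theoretical_sti(band_snrs_db, band_rt60s_s, band_weights=None):
    """Weighted sum of band transmission indices.

    band_weights defaults to equal weights, which is an assumption made
    for this sketch rather than a standardized weighting.
    """
    n = len(band_snrs_db)
    if len(band_rt60s_s) != n:
        raise ValueError("need one reverberation time per octave band")
    if band_weights is None:
        band_weights = [1.0 / n] * n
    return sum(w * band_transmission_index(snr, rt)
               for w, snr, rt in zip(band_weights, band_snrs_db, band_rt60s_s))
```

With no reverberation, the apparent SNR equals the actual SNR, so `theoretical_sti([0.0] * 7, [0.0] * 7)` evaluates to about 0.5; adding reverberation at a fixed SNR lowers the value.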