Prediction of speech intelligibility based on an auditory preprocessing model

Cited by: 48
Authors
Christiansen, Claus [1]
Pedersen, Michael Syskind [2]
Dau, Torsten [1]
Affiliations
[1] Tech Univ Denmark, Dept Elect Engn, Ctr Appl Hearing Res, DK-2800 Lyngby, Denmark
[2] Oticon AS, DK-2765 Smorum, Denmark
Keywords
Speech intelligibility; Auditory processing model; Ideal binary mask; Speech intelligibility index; Speech transmission index; SHORT-TERM ADAPTATION; RECEPTION THRESHOLD; TRANSMISSION INDEX; QUALITY ASSESSMENT; FLUCTUATING NOISE; ITU STANDARD; NERVE; MODULATION; MASKING; SENTENCES;
DOI
10.1016/j.specom.2010.03.004
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII), are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory preprocessing [Dau et al., 1997. J. Acoust. Soc. Am. 102, 2892-2905] with a simple central stage that describes the similarity of the test signal with the corresponding reference signal at the level of the internal representation of the signals. The model was compared with previous approaches, whereby a speech-in-noise experiment was used for training and an ideal binary mask experiment was used for evaluation. All three models were able to capture the trends in the speech-in-noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary masks degenerate to a noise vocoder. (C) 2010 Elsevier B.V. All rights reserved.
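The abstract describes a central stage that scores how similar the internal representation of a test signal is to that of its reference, but it does not say which similarity measure is used. The following is a minimal sketch of that idea only, assuming (purely for illustration) a Pearson correlation over channel-by-frame representations. The function name `similarity` and the list-of-lists layout are hypothetical and are not taken from the paper.

```python
import math


def similarity(reference, test):
    """Pearson correlation between two internal representations.

    Assumption for illustration: each representation is a list of
    channels, each channel a list of frame values, and both inputs
    have the same layout. The measure used in the paper may differ.
    """
    # Flatten channel-by-frame data into one vector per signal.
    ref = [v for channel in reference for v in channel]
    tst = [v for channel in test for v in channel]
    if not ref or len(ref) != len(tst):
        raise ValueError("representations must be non-empty and equal in size")

    # Remove the mean of each vector.
    mean_ref = sum(ref) / len(ref)
    mean_tst = sum(tst) / len(tst)
    num = sum((r - mean_ref) * (t - mean_tst) for r, t in zip(ref, tst))
    den = math.sqrt(
        sum((r - mean_ref) ** 2 for r in ref)
        * sum((t - mean_tst) ** 2 for t in tst)
    )
    # A constant representation carries no pattern to compare; return 0.
    return num / den if den > 0 else 0.0
```

With this measure, a test representation identical to the reference (or a positively scaled copy of it) scores 1, and the score falls as the test pattern departs from the reference. Mapping such a score to an intelligibility percentage would need a separate fitted function, which is not sketched here.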
Pages: 678-692
Number of pages: 15