Optimal Near-End Speech Intelligibility Improvement Incorporating Additive Noise and Late Reverberation Under an Approximation of the Short-Time SII

Cited by: 28
Authors
Hendriks, Richard C. [1]
Crespo, Joao B. [1]
Jensen, Jesper [2, 3]
Taal, Cees H. [4]
Affiliations
[1] Delft Univ Technol, Signal & Informat Proc Lab, NL-2628 CD Delft, Netherlands
[2] Oticon AS, DK-2765 Smorum, Denmark
[3] Aalborg Univ, Dept Elect Syst, DK-9220 Aalborg, Denmark
[4] Philips Res, Appl Sensor Technol, NL-5656 AE Eindhoven, Netherlands
Keywords
Additive noise; approximated speech intelligibility index (SII); late reverberation; speech intelligibility; ENHANCEMENT; SENTENCES; MODEL
DOI
10.1109/TASLP.2015.2409780
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Environmental additive noise in the vicinity of the user typically degrades speech intelligibility in speech processing applications. This intelligibility loss can be compensated by suitably preprocessing the speech signal prior to play-out, an approach often referred to as near-end speech enhancement. Although most such algorithms focus on additive noise, reverberation can also severely degrade intelligibility. In this paper we investigate how late reverberation and additive noise can be jointly taken into account in the near-end speech enhancement process. To this end we use a recently presented approximation of the speech intelligibility index, which we optimize under a power constraint for speech degraded by both additive noise and late reverberation. The algorithm yields time-frequency dependent amplification factors that depend on both the additive noise power spectral density and the late reverberation energy. These amplification factors redistribute speech energy across frequency and perform dynamic range compression. Experimental results using both instrumental intelligibility measures and intelligibility listening tests show that the proposed approach improves speech intelligibility over state-of-the-art reference methods when speech signals are degraded simultaneously by additive noise and reverberation. Speech intelligibility improvements on the order of 20% are observed.
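The power-constrained optimization described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a simplified per-band intelligibility term of the form xi / (xi + 1), with xi = p / (rho * p + n), where p is the played-out speech power in a band, n the noise power, and rho a hypothetical ratio of late-reverberation energy to speech power. The function names, the band weights, and the bisection on the Lagrange multiplier are all illustrative assumptions.

```python
import math


def allocate_band_power(importance, noise, reverb_ratio, total_power, iters=200):
    """Distribute total_power over bands to maximize an ASSUMED objective:
        sum_i importance[i] * p_i / ((1 + reverb_ratio[i]) * p_i + noise[i])
    subject to sum_i p_i = total_power and p_i >= 0.
    Every noise[i] must be strictly positive.
    """

    def powers(lam):
        # Stationarity of the Lagrangian gives, for active bands,
        #   (1 + rho) * p + n = sqrt(g * n / lam);
        # bands where this would make p negative are switched off.
        out = []
        for g, n, r in zip(importance, noise, reverb_ratio):
            c = 1.0 + r
            out.append(max(0.0, (math.sqrt(g * n / lam) - n) / c))
        return out

    # At lam = max(g / n) every band receives zero power; the total allocated
    # power grows as lam shrinks, so bisect on lam (in the log domain).
    lo, hi = 1e-12, max(g / n for g, n in zip(importance, noise))
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if sum(powers(mid)) > total_power:
            lo = mid
        else:
            hi = mid
    return powers(hi)


def amplitude_gains(band_power, speech_power):
    """Convert allocated band powers into amplitude gains relative to the
    original (hypothetical) per-band speech powers."""
    return [math.sqrt(p / s) if s > 0 else 0.0
            for p, s in zip(band_power, speech_power)]


if __name__ == "__main__":
    speech = [1.0, 1.0, 1.0]    # hypothetical per-band speech power
    weights = [0.2, 0.5, 0.3]   # hypothetical band-importance weights
    noise = [0.5, 1.0, 4.0]     # hypothetical per-band noise power
    rho = [0.3, 0.3, 0.3]       # hypothetical late-reverberation ratio
    p = allocate_band_power(weights, noise, rho, total_power=sum(speech))
    print(amplitude_gains(p, speech))
```

Because the reverberation term in this toy model scales with the played-out speech power, raising the gain in a band yields diminishing returns there, so the allocation shifts energy toward bands where it helps more while keeping total power fixed.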
Pages: 851-862
Number of pages: 12