Predicting the Intelligibility of Noisy and Nonlinearly Processed Binaural Speech

Cited by: 36
Authors
Andersen, Asger Heidemann [1, 2]
de Haan, Jan Mark [1]
Tan, Zheng-Hua [2]
Jensen, Jesper [1, 2]
Affiliations
[1] Oticon AS, DK-2765 Smorum, Denmark
[2] Aalborg Univ, Dept Elect Syst, DK-9220 Aalborg, Denmark
Keywords
Binaural speech intelligibility prediction; binaural advantage; speech enhancement; speech transmission; RECEPTION THRESHOLD; TRANSMISSION INDEX; OBJECTIVE MEASURES; NORMAL-HEARING; REVERBERATION; EQUALIZATION; COMPRESSION; MODEL; TIME;
DOI
10.1109/TASLP.2016.2588002
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403
Abstract
Objective speech intelligibility measures are gaining popularity in the development of speech enhancement algorithms and speech processing devices such as hearing aids. Such devices may process the input signals nonlinearly and modify the binaural cues presented to the user. We propose a method for predicting the intelligibility of noisy and nonlinearly processed binaural speech. This prediction is based on the noisy and processed signal as well as a clean speech reference signal. The method is obtained by extending a modified version of the short-time objective intelligibility (STOI) measure with a modified equalization-cancellation (EC) stage. We evaluate the performance of the method by comparing the predictions with measured intelligibility from four listening experiments. These comparisons indicate that the proposed measure can provide accurate predictions of 1) the intelligibility of diotic speech with an accuracy similar to that of the original STOI measure, 2) speech reception thresholds (SRTs) in conditions with a frontal target speaker and a single interferer in the horizontal plane, 3) SRTs in conditions with a frontal target and a single interferer when ideal time-frequency segregation (ITFS) is applied to the left and right ears separately, and 4) the advantage of two-microphone beamforming as applied in state-of-the-art hearing aids. A MATLAB implementation of the proposed measure is available online.
Pages: 1908-1920
Page count: 13