Informational masking of speech produced by speech-like sounds without linguistic content

Cited by: 13
Authors
Chen, Jing [1, 2]
Li, Huahui [1, 2]
Li, Liang [1, 2]
Wu, Xihong [1, 2]
Moore, Brian C. J. [3]
Affiliations
[1] Peking Univ, Dept Machine Intelligence, Speech & Hearing Res Ctr, Beijing 100871, Peoples R China
[2] Peking Univ, Minist Educ, Key Lab Machine Percept, Beijing 100871, Peoples R China
[3] Univ Cambridge, Dept Expt Psychol, Cambridge CB2 3EB, England
Funding
National Natural Science Foundation of China
Keywords
PERCEIVED SPATIAL SEPARATION; FUNDAMENTAL-FREQUENCY; COMPLEX TONES; MODULATION; INTELLIGIBILITY; PERCEPTION; RECOGNITION; RELEASE; DISCRIMINATION; IDENTIFICATION
DOI
10.1121/1.3688510
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech. © 2012 Acoustical Society of America. [http://dx.doi.org/10.1121/1.3688510]
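As a rough illustration of the kind of masker described for experiment I, the following Python sketch synthesizes a harmonic complex whose F0 is sinusoidally modulated around a mean value. It is not the authors' stimulus-generation procedure: the function name `fm_harmonic_complex`, the equal-amplitude harmonics, and every default value (mean F0, modulation rate and depth, number of harmonics, duration, sample rate) are assumptions chosen only for illustration, since the abstract does not state them.

```python
import math


def fm_harmonic_complex(mean_f0=200.0, mod_rate=5.0, mod_depth=0.1,
                        n_harmonics=10, duration=0.5, sample_rate=16000):
    """Return samples of a harmonic complex with sinusoidally modulated F0.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    n_samples = int(duration * sample_rate)
    nyquist = sample_rate / 2.0
    samples = []
    phase = 0.0  # running phase of the fundamental, in radians
    for n in range(n_samples):
        t = n / sample_rate
        # Instantaneous F0: the mean value plus a sinusoidal excursion.
        f0 = mean_f0 * (1.0 + mod_depth * math.sin(2.0 * math.pi * mod_rate * t))
        # Sum equal-amplitude harmonics, skipping any above the Nyquist frequency.
        value = 0.0
        for k in range(1, n_harmonics + 1):
            if k * f0 < nyquist:
                value += math.sin(k * phase)
        samples.append(value / n_harmonics)
        # Integrate the instantaneous frequency so harmonic k tracks k * f0.
        phase += 2.0 * math.pi * f0 / sample_rate
    return samples


masker = fm_harmonic_complex(mean_f0=200.0)
```

Accumulating the phase of the fundamental, rather than computing `sin(2*pi*k*f0*t)` directly, keeps each harmonic at an exact integer multiple of the time-varying F0. Varying `mean_f0` across calls corresponds loosely to the manipulation of mean F0 mentioned in the abstract.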
Pages: 2914-2926
Number of pages: 13