Informational masking of speech produced by speech-like sounds without linguistic content

Cited by: 13
Authors
Chen, Jing [1, 2]
Li, Huahui [1, 2]
Li, Liang [1, 2]
Wu, Xihong [1, 2]
Moore, Brian C. J. [3]
Affiliations
[1] Peking Univ, Dept Machine Intelligence, Speech & Hearing Res Ctr, Beijing 100871, Peoples R China
[2] Peking Univ, Minist Educ, Key Lab Machine Percept, Beijing 100871, Peoples R China
[3] Univ Cambridge, Dept Expt Psychol, Cambridge CB2 3EB, England
Funding
National Natural Science Foundation of China
Keywords
PERCEIVED SPATIAL SEPARATION; FUNDAMENTAL-FREQUENCY; COMPLEX TONES; MODULATION; INTELLIGIBILITY; PERCEPTION; RECOGNITION; RELEASE; DISCRIMINATION; IDENTIFICATION;
DOI
10.1121/1.3688510
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics the fundamental frequency (F0) of which was sinusoidally modulated and the mean F0 of which was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech. (C) 2012 Acoustical Society of America. [http://dx.doi.org/10.1121/1.3688510]
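The experiment I masker described in the abstract (harmonics whose F0 is sinusoidally modulated around a variable mean) can be illustrated with a minimal synthesis sketch. This record does not give the stimulus parameters, so every value below (mean F0, modulation rate and depth, number of harmonics, duration, sample rate) is an assumption chosen for illustration, and the function name `fm_harmonic_complex` is hypothetical rather than taken from the paper.

```python
import math


def fm_harmonic_complex(mean_f0=200.0, mod_rate=5.0, mod_depth=0.1,
                        n_harmonics=10, duration=0.5, sample_rate=16000):
    """Return samples of an equal-amplitude harmonic complex whose
    fundamental frequency (F0) is sinusoidally modulated around mean_f0.

    All default values are illustrative assumptions, not the paper's settings.
    """
    n_samples = int(duration * sample_rate)
    dt = 1.0 / sample_rate
    phase = 0.0  # running phase of the fundamental, in radians
    samples = []
    for n in range(n_samples):
        t = n * dt
        # Instantaneous F0: mean F0 scaled by a sinusoidal modulator.
        f0 = mean_f0 * (1.0 + mod_depth * math.sin(2.0 * math.pi * mod_rate * t))
        # Integrate frequency to get phase, so the F0 glide stays continuous.
        phase += 2.0 * math.pi * f0 * dt
        # Harmonic k runs at k times the fundamental phase, which keeps
        # all harmonics coherently modulated.
        value = sum(math.sin(k * phase) for k in range(1, n_harmonics + 1))
        samples.append(value / n_harmonics)  # normalise to the range [-1, 1]
    return samples


samples = fm_harmonic_complex(mean_f0=200.0)
```

Changing `mean_f0` corresponds to varying the mean F0 of the masker relative to the target. The sketch does not model spectral shaping, level setting, or the precedence-effect presentation used to induce perceived spatial separation.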
Pages: 2914-2926
Number of pages: 13