Speaker-normalized sound representations in the human auditory cortex

Cited by: 41
Authors
Sjerps, Matthias J. [1, 2]
Fox, Neal P. [3]
Johnson, Keith [4]
Chang, Edward F. [3, 5]
Affiliations
[1] Radboud Univ Nijmegen, Donders Inst Brain Cognit & Behav, Ctr Cognit Neuroimaging, Kapittelweg 29, NL-6525 EN Nijmegen, Netherlands
[2] Max Planck Inst Psycholinguist, Wundtlaan 1, NL-6525 XD Nijmegen, Netherlands
[3] Univ Calif San Francisco, Dept Neurol Surg, 675 Nelson Rising Lane, San Francisco, CA 94158 USA
[4] Univ Calif Berkeley, Dept Linguist, 1203 Dwinelle Hall 2650, Berkeley, CA 94720 USA
[5] Univ Calif San Francisco, Weill Inst Neurosci, 675 Nelson Rising Lane, San Francisco, CA 94158 USA
Keywords
RELIABLE SPECTRAL PROPERTIES; NONSPEECH CONTEXT; TEMPORAL-LOBE; SPEECH; VOICE; COMPENSATION; PERCEPTION; RESPONSES; ADAPTATION; MASKING;
DOI
10.1038/s41467-019-10365-z
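Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code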
07 ; 0710 ; 09 ;
Abstract
The acoustic dimensions that distinguish speech sounds (like the vowel differences in "boot" and "boat") also differentiate speakers' voices. Therefore, listeners must normalize across speakers without losing linguistic information. Past behavioral work suggests an important role for auditory contrast enhancement in normalization: preceding context affects listeners' perception of subsequent speech sounds. Here, using intracranial electrocorticography in humans, we investigate whether and how such context effects arise in auditory cortex. Participants identified speech sounds that were preceded by phrases from two different speakers whose voices differed along the same acoustic dimension as target words (the lowest resonance of the vocal tract). In every participant, target vowels evoke a speaker-dependent neural response that is consistent with the listener's perception, and which follows from a contrast enhancement model. Auditory cortex processing thus displays a critical feature of normalization, allowing listeners to extract meaningful content from the voices of diverse speakers.
Pages: 9