Simple Acoustic Features Can Explain Phoneme-Based Predictions of Cortical Responses to Speech

Cited by: 95
Authors
Daube, Christoph [1]
Ince, Robin A. A. [1]
Gross, Joachim [1,2]
Affiliations
[1] Univ Glasgow, Inst Neurosci & Psychol, 62 Hillhead St, Glasgow G12 8QB, Lanark, Scotland
[2] Univ Munster, Inst Biomagnetism & Biosignalanal, Malmedyweg 15, D-48149 Munster, Germany
Funding
Wellcome Trust (UK);
Keywords
INFORMATION; DYNAMICS; COMPREHENSION; OSCILLATIONS; PRINCIPLES; FREQUENCY; ENVELOPE; MODELS; MEG;
DOI
10.1016/j.cub.2019.04.067
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject Classification Codes
071010; 081704;
Abstract
When we listen to speech, we have to make sense of a waveform of sound pressure. Hierarchical models of speech perception assume that, to extract semantic meaning, the signal is transformed into unknown, intermediate neuronal representations. Traditionally, studies of such intermediate representations are guided by linguistically defined concepts, such as phonemes. Here, we argue that in order to arrive at an unbiased understanding of the neuronal responses to speech, we should focus instead on representations obtained directly from the stimulus. We illustrate our view with a data-driven, information-theoretic analysis of a dataset of 24 young, healthy humans who listened to a 1 h narrative while their magnetoencephalogram (MEG) was recorded. We find that two recent results (the improved performance of an encoding model in which annotated linguistic and acoustic features were combined, and the decoding of phoneme subgroups from phoneme-locked responses) can be explained by an encoding model that is based entirely on acoustic features. These acoustic features capitalize on acoustic edges and outperform Gabor-filtered spectrograms, which can explicitly describe the spectrotemporal characteristics of individual phonemes. By replicating our results in publicly available electroencephalography (EEG) data, we conclude that models of brain responses based on linguistic features can serve as excellent benchmarks. However, we believe that in order to further our understanding of human cortical responses to speech, we should also explore low-level and parsimonious explanations for apparent high-level phenomena.
Pages: 1924+
Page count: 23
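
The abstract above describes forward ("encoding") models that predict a recorded MEG/EEG signal from lagged copies of stimulus features, and compares feature sets such as acoustic edges against Gabor-filtered spectrograms. The following minimal Python sketch (numpy/scikit-learn) illustrates that model class only, not the authors' pipeline: it derives a crude "acoustic edge" feature as the half-wave-rectified derivative of a speech envelope and fits a cross-validated ridge regression to a single channel. The sampling rate, lag window, and regularization strength are illustrative assumptions, not values from the paper.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

fs = 100  # Hz; assumed common sampling rate of stimulus features and brain signal

def lagged_design(x, lags):
    # Stack time-shifted copies of a 1-D feature into a design matrix,
    # one column per non-negative lag (in samples).
    X = np.zeros((len(x), len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = x[:len(x) - lag] if lag > 0 else x
    return X

def envelope_onsets(env):
    # Half-wave-rectified first derivative of the envelope: a simple
    # "acoustic edge" feature of the kind the abstract refers to.
    return np.maximum(np.diff(env, prepend=env[0]), 0.0)

def cv_prediction_corr(feature, response, lags, alpha=1e3, n_splits=5):
    # Cross-validated Pearson correlation between predicted and measured
    # response; contiguous folds respect the time-series structure.
    X = lagged_design(feature, lags)
    scores = []
    for train, test in KFold(n_splits=n_splits).split(X):
        pred = Ridge(alpha=alpha).fit(X[train], response[train]).predict(X[test])
        scores.append(np.corrcoef(pred, response[test])[0, 1])
    return float(np.mean(scores))

# Toy demonstration on synthetic data (replace env/meg with a real speech
# envelope and a real MEG/EEG channel at matching sampling rates).
rng = np.random.default_rng(0)
env = np.abs(rng.standard_normal(60 * fs)).cumsum() % 1.0  # fake 60 s envelope
onsets = envelope_onsets(env)
lags = np.arange(0, int(0.4 * fs))                         # 0-400 ms, assumed
meg = np.convolve(onsets, rng.standard_normal(20), mode="same")
meg = meg + 0.5 * rng.standard_normal(len(meg))            # fake noisy response
print("edge-feature encoding model, mean CV r =",
      cv_prediction_corr(onsets, meg, lags))

Running cv_prediction_corr on a competing feature set (for example, the channels of a Gabor-filtered spectrogram stacked into the same lagged design matrix) and comparing the cross-validated correlations is the basic form of the model comparison the abstract reports; the paper's actual analysis additionally uses information-theoretic measures and richer feature sets.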