RETRACTED: The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech (Retracted article; see Vol. 112, p. 2667, 2014)

Cited by: 11
Authors
Crosse, Michael J. [1 ,2 ]
Lalor, Edmund C. [1 ,2 ,3 ]
Affiliations
[1] Univ Dublin Trinity Coll, Sch Engn, Dublin 2, Ireland
[2] Univ Dublin Trinity Coll, Trinity Ctr Bioengn, Dublin 2, Ireland
[3] Univ Dublin Trinity Coll, Trinity Coll Inst Neurosci, Dublin 2, Ireland
Funding
Science Foundation Ireland;
Keywords
multisensory integration; analysis-by-synthesis; latency; EEG; TRF; human auditory cortex; temporal envelope; perception; responses; system; sounds; brain; bases;
DOI
10.1152/jn.00690.2013
Chinese Library Classification
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by > 10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information.
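The forward mapping described in the abstract is in the spirit of the temporal response function (TRF) approach: the EEG at each time point is modeled as a weighted sum of time-lagged copies of the speech envelope, with the weights estimated by regularized regression. The sketch below is a minimal illustration on synthetic data, assuming simple ridge regression; all variable names, the lag range, and the regularization value are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def lagged_design(stim, lags):
    """Design matrix whose columns are time-lagged copies of the stimulus envelope."""
    T = len(stim)
    X = np.zeros((T, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[:T - lag]
        else:
            X[:T + lag, j] = stim[-lag:]
    return X

def estimate_trf(stim, eeg, lags, ridge=1.0):
    """Forward model: map envelope -> EEG via ridge regression over lags."""
    X = lagged_design(stim, lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ eeg)

# Synthetic demo: "EEG" is the envelope delayed by 5 samples, plus noise.
rng = np.random.default_rng(0)
stim = rng.standard_normal(2000)
true_lag = 5
eeg = np.zeros_like(stim)
eeg[true_lag:] = 0.8 * stim[:-true_lag]
eeg += 0.1 * rng.standard_normal(len(stim))

lags = list(range(0, 16))
trf = estimate_trf(stim, eeg, lags, ridge=1e-2)
peak_lag = lags[int(np.argmax(np.abs(trf)))]
print(peak_lag)  # the estimated TRF should peak at the true delay
```

A latency shift such as the >10 ms effect reported in the abstract would appear in this framework as an earlier TRF peak in one condition than another; the reverse (stimulus-reconstruction) mapping is the same regression with the roles of `stim` and `eeg` exchanged.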
Pages: 1400-1408 (9 pages)
References
40 in total
[1] Anonymous. Hearing by Eye. 1987.
[2] Arnal LH, Morillon B, Kell CA, Giraud A-L. Dual neural routing of visual facilitation in speech processing. Journal of Neuroscience, 2009, 29(43): 13445-13453.
[3] Barjatya A. IEEE Transactions on Evolutionary Computation, 2004, 8: 225. DOI: 10.1109/TEVC.2004.826069.
[4] Bartels A, Zeki S, Logothetis NK. Natural vision reveals regional specialization to local motion and to contrast-invariant, global flow in the human brain. Cerebral Cortex, 2008, 18(3): 705-717.
[5] Beauchamp MS, Lee KE, Argall BD, Martin A. Integration of auditory and visual information about objects in superior temporal sulcus. Neuron, 2004, 41(5): 809-823.
[6] Besle J, Fort A, Delpuech C, Giard MH. Bimodal speech: early suppressive visual effects in human auditory cortex. European Journal of Neuroscience, 2004, 20(8): 2225-2234.
[7] Brandwein AB, et al. Cerebral Cortex, 2013, 23: 1329. DOI: 10.1093/cercor/bhs109; 10.1093/cercor/bht213.
[8] Campbell R. The processing of audio-visual speech: empirical and neural bases. Philosophical Transactions of the Royal Society B: Biological Sciences, 2008, 363(1493): 1001-1010.
[9] Chandrasekaran C, Lemus L, Ghazanfar AA. Dynamic faces speed up the onset of auditory cortical spiking responses during vocal detection. Proceedings of the National Academy of Sciences of the United States of America, 2013, 110(48): E4668-E4677.
[10] Chandrasekaran C, Trubanova A, Stillittano S, Caplier A, Ghazanfar AA. The natural statistics of audiovisual speech. PLoS Computational Biology, 2009, 5(7).