Audio-visual onset differences are used to determine syllable identity for ambiguous audio-visual stimulus pairs

Cited by: 30
Authors
ten Oever, Sanne [1 ]
Sack, Alexander T. [1 ]
Wheat, Katherine L. [1 ]
Bien, Nina [1 ,2 ]
van Atteveldt, Nienke [1 ,3 ]
Affiliations
[1] Maastricht Univ, Fac Psychol & Neurosci, NL-6200 MD Maastricht, Netherlands
[2] Univ Luxembourg, EMACS Res Unit, Luxembourg, Luxembourg
[3] Netherlands Inst Neurosci, Neuroimaging & Neuromodeling Grp, Amsterdam, Netherlands
Source
FRONTIERS IN PSYCHOLOGY | 2013, Vol. 4
Keywords
audiovisual; temporal cues; audio-visual onset differences; content cues; predictability; detection; multisensory integration; visual speech; crossmodal binding; neuronal oscillations; auditory cortex; perception; sounds; modulation; synchrony; voices
DOI
10.3389/fpsyg.2013.00331
Chinese Library Classification
B84 [Psychology]
Discipline Code
04; 0402
Abstract
Content and temporal cues have been shown to interact during audio-visual (AV) speech identification. Typically, the most reliable unimodal cue is used more strongly to identify specific speech features; however, visual cues are only used if the AV stimuli are presented within a certain temporal window of integration (TWI). This suggests that temporal cues denote whether unimodal stimuli belong together, that is, whether they should be integrated. It is not known whether temporal cues also provide information about the identity of a syllable. Since spoken syllables have naturally varying AV onset asynchronies, we hypothesize that for suboptimal AV cues presented within the TWI, information about the natural AV onset differences can aid speech identification. To test this, we presented low-intensity auditory syllables concurrently with visual speech signals and varied the stimulus onset asynchronies (SOAs) of the AV pairs, while participants were instructed to identify the auditory syllables. We found that specific speech features (e.g., voicing) were identified by relying primarily on one modality (e.g., auditory). Additionally, we showed a wide window in which visual information influenced auditory perception, which seemed even wider for congruent stimulus pairs. Finally, we found a specific response pattern across the SOA range for syllables that were not reliably identified by the unimodal cues, which we explained as the result of the use of natural onset differences between AV speech signals. This indicates that temporal cues not only provide information about the temporal integration of AV stimuli, but also convey information about the identity of AV pairs. These results provide a detailed behavioral basis for further neuroimaging and stimulation studies to unravel the neurofunctional mechanisms of the audio-visual-temporal interplay in speech perception.
Pages: 13