Timing in audiovisual speech perception: A mini review and new psychophysical data

Cited by: 17
Authors
Venezia, Jonathan H. [1 ]
Thurman, Steven M. [2 ]
Matchin, William [3 ]
George, Sahara E. [4 ]
Hickok, Gregory [1 ]
Affiliations
[1] Univ Calif Irvine, Dept Cognit Sci, Irvine, CA 92697 USA
[2] Univ Calif Los Angeles, Dept Psychol, Los Angeles, CA USA
[3] Univ Maryland, Dept Linguist, Baltimore, MD 21201 USA
[4] Univ Calif Irvine, Dept Anat & Neurobiol, Irvine, CA 92717 USA
Keywords
Audiovisual speech; Multisensory integration; Prediction; Classification image; Timing; McGurk; Speech kinematics; Visual speech; Spatiotemporal dynamics; Superior colliculus; Movement velocity; Temporal window; Recognition; Information; Time; Synchrony
DOI
10.3758/s13414-015-1026-y
Chinese Library Classification (CLC)
B84 [Psychology]
Subject Classification
04; 0402
Abstract
Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others, varying randomly across trials. Variability in participants' responses (~35% identification of /apa/ compared to ~5% in the absence of the masker) served as the basis for classification analysis. The outcome was a high-resolution spatiotemporal map of perceptually relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally leading (~130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content.
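The masking-and-classification logic sketched in the abstract (random per-frame transparency masks, /apa/ yes-no responses, and a classification analysis relating the two) can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not the authors' actual pipeline: the trial count, mask resolution, the simulated observer, and the "informative" region are invented for illustration, and the difference-of-means classification image with permutation z-scoring is a generic stand-in for whatever statistics the paper used.

```python
# Minimal sketch of a classification-image analysis in the spirit of the
# abstract; all names and parameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_trials = 2000        # trials per observer (assumed)
n_frames = 30          # video frames per stimulus (assumed)
h, w = 16, 16          # coarse spatial grid over the mouth region (assumed)

# Random transparency masks: 1.0 = fully visible, 0.0 = fully obscured.
masks = rng.random((n_trials, n_frames, h, w))

# Simulated observer with a hypothetical "informative" spatiotemporal
# region; obscuring it breaks the McGurk illusion, so /apa/ reports
# (auditory capture) become more likely when visibility there is low.
informative = np.zeros((n_frames, h, w))
informative[8:14, 6:10, 4:12] = 1.0
evidence = (masks * informative).sum(axis=(1, 2, 3))
# The +0.6 offset roughly targets the ~35% /apa/ rate from the abstract.
p_apa = 1.0 / (1.0 + np.exp((evidence - evidence.mean()) / evidence.std() + 0.6))
responses = rng.random(n_trials) < p_apa   # True = reported /apa/

# Classification image: mean mask on /apa/ trials minus mean mask on
# non-/apa/ trials, one value per (frame, pixel).
ci = masks[responses].mean(axis=0) - masks[~responses].mean(axis=0)

# Permutation null: shuffle response labels to z-score the map.
null = np.stack([
    masks[p].mean(axis=0) - masks[~p].mean(axis=0)
    for p in (rng.permutation(responses) for _ in range(200))
])
z = ci / null.std(axis=0)
print("peak |z| per frame:", np.abs(z).reshape(n_frames, -1).max(axis=1).round(1))
```

Because /apa/ reports coincide with the informative features being obscured, reliable features show up as negative classification-image values (lower average visibility on /apa/ trials); plotting |z| across frames would give a temporal profile analogous to the paper's spatiotemporal maps.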
Pages: 583-601
Page count: 19
Related Papers (50 total)
  • [21] Munhall, K. G., Kroos, C., Jozan, G., & Vatikiotis-Bateson, E. Spatial frequency requirements for audiovisual speech perception. Perception & Psychophysics, 2004, 66: 574-583.
  • [22] Fisher, V. L., Dean, C. L., Nave, C. S., Parkins, E. V., Kerkhoff, W. G., & Kwakye, L. D. Increases in sensory noise predict attentional disruptions to audiovisual speech perception. Frontiers in Human Neuroscience, 2023, 16.
  • [23] Altieri, N., & Townsend, J. T. An assessment of behavioral dynamic information processing measures in audiovisual speech perception. Frontiers in Psychology, 2011, 2.
  • [24] Gijbels, L., Yeatman, J. D., Lalonde, K., Doering, P., & Lee, A. K. C. Audiovisual Speech Perception Benefits are Stable from Preschool through Adolescence. Multisensory Research, 2024, 37(4-5): 317-340.
  • [25] Saalasti, S., Kätsyri, J., Tiippana, K., Laine-Hernandez, M., von Wendt, L., & Sams, M. Audiovisual Speech Perception and Eye Gaze Behavior of Adults with Asperger Syndrome. Journal of Autism and Developmental Disorders, 2012, 42: 1606-1615.
  • [26] Alsius, A., Möttönen, R., Sams, M. E., Soto-Faraco, S., & Tiippana, K. Effect of attentional load on audiovisual speech perception: evidence from ERPs. Frontiers in Psychology, 2014, 5.
  • [27] Xie, H., Zeng, B., & Wang, R. Visual Timing Information in Audiovisual Speech Perception: Evidence from Lexical Tone Contour. 19th Annual Conference of the International Speech Communication Association (Interspeech 2018), 2018: 3781-3785.
  • [28] Getz, L. M., Nordeen, E. R., Vrabic, S. C., & Toscano, J. C. Modeling the Development of Audiovisual Cue Integration in Speech Perception. Brain Sciences, 2017, 7(3).
  • [29] Aller, S., & Meister, E. Perception of Audiovisual Speech Produced by Human and Virtual Speaker. Human Language Technologies - The Baltic Perspective, 2016, 289: 31-38.
  • [30] Choi, A., Kim, H., Jo, M., Kim, S., Joung, H., Choi, I., & Lee, K. The impact of visual information in speech perception for individuals with hearing loss: a mini review. Frontiers in Psychology, 2024, 15.