Timing in audiovisual speech perception: A mini review and new psychophysical data

Cited by: 17

Authors:

Venezia, Jonathan H. [1]

Thurman, Steven M. [2]

Matchin, William [3]

George, Sahara E. [4]

Hickok, Gregory [1]

Affiliations:

[1] Univ Calif Irvine, Dept Cognit Sci, Irvine, CA 92697 USA

[2] Univ Calif Los Angeles, Dept Psychol, Los Angeles, CA USA

[3] Univ Maryland, Dept Linguist, Baltimore, MD 21201 USA

[4] Univ Calif Irvine, Dept Anat & Neurobiol, Irvine, CA 92717 USA

Source:

ATTENTION PERCEPTION & PSYCHOPHYSICS | 2016 / Vol. 78 / Issue 02

Keywords:

Audiovisual speech; Multisensory integration; Prediction; Classification image; Timing; McGurk; Speech kinematics; MULTISENSORY INTEGRATION; VISUAL SPEECH; SPATIOTEMPORAL DYNAMICS; SUPERIOR COLLICULUS; MOVEMENT VELOCITY; TEMPORAL WINDOW; RECOGNITION; INFORMATION; TIME; SYNCHRONY;

DOI:

10.3758/s13414-015-1026-y

Chinese Library Classification (CLC) number:

B84 [Psychology];

Discipline classification code:

04 ; 0402 ;

Abstract:

Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (~35% identification of /apa/ compared to ~5% in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (~130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content.
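The classification procedure described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' analysis: it assumes a purely temporal, binary mask (each frame either obscured or visible), a hypothetical simulated observer, and a simple difference-of-means estimator. The names `simulate_trial`, `classification_image`, `N_FRAMES`, and `CRITICAL`, and all numeric parameters, are illustrative assumptions.

```python
import random

def simulate_trial(n_frames, critical, rng):
    """One trial: a random per-frame mask (1 = mouth obscured) and a yes/no /apa/ response."""
    mask = [rng.randint(0, 1) for _ in range(n_frames)]
    # Hypothetical observer (an assumption, not the paper's model): the more
    # critical frames are obscured, the more likely the auditory /apa/ is reported.
    hidden = sum(mask[f] for f in critical) / len(critical)
    p_apa = 0.05 + 0.6 * hidden
    return mask, rng.random() < p_apa

def classification_image(trials, n_frames):
    """Per-frame mean mask on /apa/ trials minus mean mask on non-/apa/ trials."""
    yes = [m for m, said_apa in trials if said_apa]
    no = [m for m, said_apa in trials if not said_apa]

    def mean(rows, f):
        return sum(row[f] for row in rows) / len(rows)

    return [mean(yes, f) - mean(no, f) for f in range(n_frames)]

rng = random.Random(0)        # fixed seed so the simulation is repeatable
N_FRAMES = 20                 # illustrative number of video frames
CRITICAL = [8, 9, 10]         # illustrative frames carrying the visual cue
trials = [simulate_trial(N_FRAMES, CRITICAL, rng) for _ in range(5000)]
cimg = classification_image(trials, N_FRAMES)

# Frame whose masking most strongly pushes responses toward /apa/
peak = max(range(N_FRAMES), key=lambda f: cimg[f])
```

In this toy setup, frames in `CRITICAL` receive clearly positive weights while the remaining frames hover near zero, which is the sense in which such a map tracks perceptually relevant information in time. The study's maps are spatiotemporal and based on a transparency mask, so a fuller analysis would need a spatial dimension and graded mask values.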

Citation

Pages: 583-601

Page count: 19

112 references in total
