Eye Gaze and Perceptual Adaptation to Audiovisual Degraded Speech

Cited by: 7
Authors
Banks, Briony [1, 5]
Gowen, Emma [1 ]
Munro, Kevin J. [2, 3]
Adank, Patti [4 ]
Affiliations
[1] Univ Manchester, Fac Biol Med & Hlth, Div Neurosci & Expt Psychol, Manchester, Lancs, England
[2] Univ Manchester, Fac Biol Med & Hlth, Manchester Ctr Audiol & Deafness, Manchester, Lancs, England
[3] Manchester Univ NHS Fdn Trust, Manchester Acad Hlth Sci Ctr, Manchester, Lancs, England
[4] UCL, Speech Hearing & Phonet Sci, London, England
[5] Univ Lancaster, Dept Psychol, Lancaster, England
Source
JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH | 2021 / Vol. 64 / Issue 09
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords
VISUAL SPEECH; ADVERSE CONDITIONS; COMPRESSED SPEECH; VOCODED SPEECH; ATTENTION; INTELLIGIBILITY; INFORMATION; TALKER; FACE; COMPREHENSION;
DOI
10.1044/2021_JSLHR-21-00106
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Purpose: Visual cues from a speaker's face may benefit perceptual adaptation to degraded speech, but current evidence is limited. We aimed to replicate results from previous studies to establish the extent to which visual speech cues can lead to greater adaptation over time, extending existing results to a real-time adaptation paradigm (i.e., without a separate training period). A second aim was to investigate whether eye gaze patterns toward the speaker's mouth were related to better perception, hypothesizing that listeners who looked more at the speaker's mouth would show greater adaptation. Method: A group of listeners (n = 30) was presented with 90 noise-vocoded sentences in audiovisual format, whereas a control group (n = 29) was presented with the audio signal only. Recognition accuracy was measured throughout and eye tracking was used to measure fixations toward the speaker's eyes and mouth in the audiovisual group. Results: Previous studies were partially replicated: The audiovisual group had better recognition throughout and adapted slightly more rapidly, but both groups showed an equal amount of improvement overall. Longer fixations on the speaker's mouth in the audiovisual group were related to better overall accuracy. An exploratory analysis further demonstrated that the duration of fixations to the speaker's mouth decreased over time. Conclusions: The results suggest that visual cues may not benefit adaptation to degraded speech as much as previously thought. Longer fixations on a speaker's mouth may play a role in successfully decoding visual speech cues; however, this will need to be confirmed in future research to fully understand how patterns of eye gaze are related to audiovisual speech recognition. All materials, data, and code are available at https://osf.io/2wqkf/.
Pages: 3432-3445
Number of pages: 14