Investigating joint attention mechanisms through spoken human-robot interaction

Cited by: 74
Authors
Staudte, Maria [1 ]
Crocker, Matthew W. [1 ]
Affiliations
[1] Saarland University, Department of Computational Linguistics, D-66123 Saarbrücken, Germany
Keywords
Utterance comprehension; Referential gaze; Joint attention; Human-robot interaction; Referential intention; Gaze; Situated language processing; Reference resolution; Eye movements; Gaze direction; Language; Speaking; Autism; Comprehension; Perception; Tracking; Children; Objects
DOI
10.1016/j.cognition.2011.05.005
Chinese Library Classification (CLC)
B84 [Psychology]
Discipline classification codes
04; 0402
Abstract
Referential gaze during situated language production and comprehension is tightly coupled with the unfolding speech stream (Griffin, 2001; Meyer, Sleiderink, & Levelt, 1998; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). In a shared environment, utterance comprehension may further be facilitated when the listener can exploit the speaker's focus of (visual) attention to anticipate, ground, and disambiguate spoken references. To investigate the dynamics of such gaze-following and its influence on utterance comprehension in a controlled manner, we use a human-robot interaction setting. Specifically, we hypothesize that referential gaze is interpreted as a cue to the speaker's referential intentions, which facilitates or disrupts reference resolution. Moreover, the use of a dynamic and yet extremely controlled gaze cue enables us to shed light on the simultaneous and incremental integration of the unfolding speech and gaze movement. We report evidence from two eye-tracking experiments in which participants saw videos of a robot looking at and describing objects in a scene. The results reveal a quantified benefit-disruption spectrum of gaze on utterance comprehension and, further, show that gaze is used, even during the initial movement phase, to restrict the spatial domain of potential referents. These findings more broadly suggest that people treat artificial agents similarly to human agents and thus validate such a setting for further explorations of joint attention mechanisms. © 2011 Elsevier B.V. All rights reserved.
Pages: 268-291
Page count: 24
References
70 in total
[1] Adams, R. B., & Kleck, R. E. (2003). Perceived gaze direction and the processing of facial displays of emotion. Psychological Science, 14(6), 644-647.
[2] Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38(4), 419-439.
[3] Altmann, G. (2004). In The interface of language, vision, and action: Eye movements and the visual world (p. 347). doi:10.1016/j.jml.2006.12.004
[4] Altmann, G. T. M., & Kamide, Y. (1999). Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition, 73(3), 247-264.
[5] Baron-Cohen, S. (1997). Mindblindness: An essay on autism and theory of mind. doi:10.7551/mitpress/4635.001.0001
[6] [Anonymous] (1999). In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. doi:10.1145/302979.303150
[7] Argyle, M., & Dean, J. (1965). Eye-contact, distance and affiliation. Sociometry, 28(3), 289-304.
[8] Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390-412.
[9] Bangerter, A. (2004). Using pointing and describing to achieve joint focus of attention in dialogue. Psychological Science, 15(6), 415-419.
[10] Baron-Cohen, S. (1997). Child Development, 68, 48.