RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments

Cited by: 237
Authors
Fischer, Tobias [1]
Chang, Hyung Jin [1]
Demiris, Yiannis [1]
Affiliations
[1] Imperial Coll London, Dept Elect & Elect Engn, Personal Robot Lab, London, England
Source
COMPUTER VISION - ECCV 2018, PT X | 2018 / Vol. 11214
Funding
EU Horizon 2020;
Keywords
Gaze estimation; Gaze dataset; Convolutional neural network; Semantic inpainting; Eyetracking glasses; APPEARANCE;
DOI
10.1007/978-3-030-01249-6_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we consider the problem of robust gaze estimation in natural environments. Large camera-to-subject distances and high variations in head pose and eye gaze angles are common in such environments. This leads to two main shortfalls in state-of-the-art methods for gaze estimation: hindered ground truth gaze annotation and diminished gaze estimation accuracy as image resolution decreases with distance. We first record a novel dataset of varied gaze and head pose images in a natural environment, addressing the issue of ground truth annotation by measuring head pose using a motion capture system and eye gaze using mobile eyetracking glasses. We apply semantic image inpainting to the area covered by the glasses to bridge the gap between training and testing images by removing the obtrusiveness of the glasses. We also present a new real-time algorithm involving appearance-based deep convolutional neural networks with increased capacity to cope with the diverse images in the new dataset. Experiments with this network architecture are conducted on a number of diverse eye-gaze datasets including our own, and in cross dataset evaluations. We demonstrate state-of-the-art performance in terms of estimation accuracy in all experiments, and the architecture performs well even on lower resolution images.
Pages: 339-357
Number of pages: 19