Gaze Estimation by Exploring Two-Eye Asymmetry

Cited by: 122
Authors
Cheng, Yihua [1]
Zhang, Xucong [2]
Lu, Feng [1,3,4]
Sato, Yoichi [5]
Affiliations
[1] Beihang University, State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beijing 100191, China
[2] ETH Zurich, Department of Computer Science, CH-8006 Zurich, Switzerland
[3] Peng Cheng Laboratory, Shenzhen 518000, China
[4] Beihang University, Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beijing 100191, China
[5] The University of Tokyo, Institute of Industrial Science, Tokyo 153-8505, Japan
Funding
National Natural Science Foundation of China
Keywords
Gaze estimation; asymmetric regression; evaluation network; eye appearance; tracking techniques; appearance; prediction
DOI
10.1109/TIP.2020.2982828
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Eye gaze estimation is increasingly in demand in intelligent systems as an enabler for a range of interactive applications. Unfortunately, learning the highly complicated regression from a single eye image to the gaze direction is not trivial, and the problem has yet to be solved efficiently. Inspired by two-eye asymmetry, i.e., that the two eyes of the same person may appear uneven, we propose the face-based asymmetric regression-evaluation network (FARE-Net) to optimize gaze estimation results by considering the difference between the left and right eyes. The proposed method comprises a face-based asymmetric regression network (FAR-Net) and an evaluation network (E-Net). The FAR-Net predicts 3D gaze directions for both eyes and is trained with an asymmetric mechanism, which asymmetrically weights and sums the losses generated by the two eyes' gaze directions. With this mechanism, the FAR-Net relies more heavily on the eye that achieves higher accuracy to optimize the network. The E-Net learns the reliability of each eye in order to balance the asymmetric and symmetric mechanisms during learning. Our FARE-Net achieves leading performance on the MPIIGaze, EyeDiap, and RT-Gene datasets. Additionally, we investigate the effectiveness of FARE-Net by analyzing the error distribution and through an ablation study.
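To make the asymmetric mechanism concrete, below is a minimal sketch of how the two per-eye losses might be combined. It assumes inverse-error weighting (so the eye with the smaller angular error dominates the update) and a scalar reliability score from the evaluation network that blends the asymmetric and symmetric (equal-weight) schemes. The function name, exact weighting formula, and tensor shapes are illustrative assumptions; the abstract only states that the two losses are weighted asymmetrically and balanced by the E-Net.

```python
import torch

def fare_style_loss(pred_l, pred_r, gt_l, gt_r, p_asym):
    """Hypothetical sketch of an asymmetric two-eye gaze loss.

    pred_l, pred_r, gt_l, gt_r: (B, 3) unit gaze-direction vectors.
    p_asym: (B,) score in [0, 1], e.g. from an evaluation network,
            blending the asymmetric and symmetric weightings.
    """
    eps = 1e-7
    # Per-eye angular errors in radians.
    e_l = torch.acos(torch.clamp((pred_l * gt_l).sum(-1), -1 + eps, 1 - eps))
    e_r = torch.acos(torch.clamp((pred_r * gt_r).sum(-1), -1 + eps, 1 - eps))
    errors = torch.stack([e_l, e_r], dim=-1)          # (B, 2)

    # Asymmetric weights: inverse-error normalization, so the eye with
    # the smaller error receives the larger weight (an assumption; the
    # abstract does not give the formula). Weights are detached so they
    # act as coefficients rather than an extra gradient path.
    inv = 1.0 / (errors.detach() + eps)
    w_asym = inv / inv.sum(-1, keepdim=True)          # rows sum to 1

    # Symmetric weights: both eyes contribute equally.
    w_sym = torch.full_like(w_asym, 0.5)

    # Blend the two mechanisms with the reliability score.
    p = p_asym.unsqueeze(-1)                          # (B, 1)
    w = p * w_asym + (1.0 - p) * w_sym
    return (w * errors).sum(-1).mean()
```

With p_asym fixed at 0 this reduces to the plain mean angular error over both eyes; at 1 it becomes a best-eye-dominated error, which matches the intuition of letting the higher-performing eye drive the optimization.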
Pages: 5259-5272
Number of pages: 14