Gaze Estimation by Exploring Two-Eye Asymmetry

Cited by: 122
Authors
Cheng, Yihua [1]
Zhang, Xucong [2]
Lu, Feng [1,3,4]
Sato, Yoichi [5]
Affiliations
[1] Beihang University, State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beijing 100191, China
[2] ETH Zurich, Department of Computer Science, CH-8006 Zurich, Switzerland
[3] Peng Cheng Laboratory, Shenzhen 518000, China
[4] Beihang University, Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beijing 100191, China
[5] The University of Tokyo, Institute of Industrial Science, Tokyo 153-8505, Japan
Funding
National Natural Science Foundation of China
Keywords
Gaze estimation; asymmetric regression; evaluation network; eye appearance; tracking techniques; appearance; prediction
DOI
10.1109/TIP.2020.2982828
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Eye gaze estimation is increasingly in demand in intelligent systems as an enabler for a range of interactive applications. Unfortunately, learning the highly complicated regression from a single eye image to the gaze direction is not trivial, and the problem has yet to be solved efficiently. Inspired by two-eye asymmetry, i.e., that the two eyes of the same person may appear uneven, we propose the face-based asymmetric regression-evaluation network (FARE-Net) to optimize gaze estimation results by considering the difference between the left and right eyes. The proposed method comprises a face-based asymmetric regression network (FAR-Net) and an evaluation network (E-Net). The FAR-Net predicts 3D gaze directions for both eyes and is trained with an asymmetric mechanism, which asymmetrically weights and sums the losses generated by the two eyes' gaze directions. With this mechanism, the FAR-Net relies more heavily on the eye that achieves higher accuracy to optimize the network. The E-Net learns the reliability of each eye in order to balance the asymmetric and symmetric mechanisms during learning. Our FARE-Net achieves leading performance on the MPIIGaze, EyeDiap, and RT-Gene datasets. Additionally, we investigate the effectiveness of FARE-Net by analyzing the error distribution and through an ablation study.
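To make the asymmetric mechanism concrete, below is a minimal sketch of how the two per-eye losses might be combined. It assumes inverse-error weighting (so the eye with the smaller angular error dominates the update) and a scalar reliability score from the evaluation network that blends the asymmetric and symmetric (equal-weight) schemes. The function name, exact weighting formula, and tensor shapes are illustrative assumptions; the abstract only states that the two losses are weighted asymmetrically and balanced by the E-Net.

```python
import torch

def fare_style_loss(pred_l, pred_r, gt_l, gt_r, p_asym):
    """Hypothetical sketch of an asymmetric two-eye gaze loss.

    pred_l, pred_r, gt_l, gt_r: (B, 3) unit gaze-direction vectors.
    p_asym: (B,) score in [0, 1], e.g. from an evaluation network,
            blending the asymmetric and symmetric weightings.
    """
    eps = 1e-7
    # Per-eye angular errors in radians.
    e_l = torch.acos(torch.clamp((pred_l * gt_l).sum(-1), -1 + eps, 1 - eps))
    e_r = torch.acos(torch.clamp((pred_r * gt_r).sum(-1), -1 + eps, 1 - eps))
    errors = torch.stack([e_l, e_r], dim=-1)          # (B, 2)

    # Asymmetric weights: inverse-error normalization, so the eye with
    # the smaller error receives the larger weight (an assumption; the
    # abstract does not give the formula). Weights are detached so they
    # act as coefficients rather than an extra gradient path.
    inv = 1.0 / (errors.detach() + eps)
    w_asym = inv / inv.sum(-1, keepdim=True)          # rows sum to 1

    # Symmetric weights: both eyes contribute equally.
    w_sym = torch.full_like(w_asym, 0.5)

    # Blend the two mechanisms with the reliability score.
    p = p_asym.unsqueeze(-1)                          # (B, 1)
    w = p * w_asym + (1.0 - p) * w_sym
    return (w * errors).sum(-1).mean()
```

With p_asym fixed at 0 this reduces to the plain mean angular error over both eyes; at 1 it becomes a best-eye-dominated error, which matches the intuition of letting the higher-performing eye drive the optimization.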
Pages: 5259-5272
Number of pages: 14