DGaze: CNN-Based Gaze Prediction in Dynamic Scenes

Cited by: 73
Authors
Hu, Zhiming [1 ]
Li, Sheng [1 ]
Zhang, Congyi [1 ,2 ]
Yi, Kangrui [1 ]
Wang, Guoping [1 ]
Manocha, Dinesh [3 ]
Affiliations
[1] Peking Univ, Beijing, Peoples R China
[2] Univ Hong Kong, Hong Kong, Peoples R China
[3] Univ Maryland, College Pk, MD 20742 USA
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Predictive models; Gaze tracking; Solid modeling; Head; Analytical models; Data models; Rendering (computer graphics); Gaze prediction; convolutional neural network; eye tracking; dynamic scene; gaze-contingent rendering; virtual reality; NEURAL-NETWORKS; SALIENCY; MODEL;
DOI
10.1109/TVCG.2020.2973473
Chinese Library Classification
TP31 [Computer Software];
Discipline Code
081202; 0835;
Abstract
We conduct novel analyses of users' gaze behaviors in dynamic virtual scenes and, based on our analyses, present a novel CNN-based model called DGaze for gaze prediction in HMD-based applications. We first collect 43 users' eye tracking data in 5 dynamic scenes under free-viewing conditions. Next, we perform statistical analysis of our data and observe that dynamic object positions, head rotation velocities, and salient regions are correlated with users' gaze positions. Based on our analysis, we present a CNN-based model (DGaze) that combines the object position sequence, head velocity sequence, and saliency features to predict users' gaze positions. Our model can be applied to predict not only real-time gaze positions but also gaze positions in the near future, and achieves better performance than the prior method. In terms of real-time prediction, DGaze achieves a 22.0% improvement over the prior method in dynamic scenes and an improvement of 9.5% in static scenes, using the angular distance as the evaluation metric. We also propose a variant of our model called DGaze_ET that can predict future gaze positions with higher precision by incorporating accurate past gaze data gathered using an eye tracker. We further analyze our CNN architecture and verify the effectiveness of each component of our model. We apply DGaze to gaze-contingent rendering and a game, and also present the evaluation results from a user study.
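The abstract reports accuracy as angular distance, i.e., the angle between the predicted and ground-truth gaze directions. A minimal NumPy sketch of this metric (the function name and the 3D unit-vector convention are assumptions for illustration, not taken from the paper):

```python
import numpy as np

def angular_distance(pred, target):
    """Angle in degrees between two 3D gaze direction vectors."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    cos_angle = np.dot(pred, target) / (np.linalg.norm(pred) * np.linalg.norm(target))
    # Clip to guard against floating-point values slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Identical directions give 0 degrees; orthogonal directions give 90 degrees.
```

Averaging this quantity over all predicted frames yields a single error score, which is the usual way such a metric is reported for gaze prediction.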
Pages: 1902-1911
Number of pages: 10