Fringe projection-based single-shot 3D eye tracking using deep learning and computer graphics

Cited: 1
Authors
Zheng, Yi [1 ]
Chao, Qing [1 ]
An, Yatong [1 ]
Hirsh, Seth [1 ]
Fix, Alexander [1 ]
Affiliations
[1] Meta Reality Labs Research, Redmond, WA 98052, USA
Source
OPTICAL ARCHITECTURES FOR DISPLAYS AND SENSING IN AUGMENTED, VIRTUAL, AND MIXED REALITY (AR, VR, MR) IV | 2023 / Vol. 12449
Keywords
eye tracking; fringe projection profilometry; 3D sensing; physically based rendering; deep learning;
DOI
10.1117/12.2667763
Chinese Library Classification
O43 [Optics];
Discipline codes
070207; 0803;
Abstract
High-accuracy, high-speed 3D sensing technology plays an essential role in VR eye tracking, as it builds a bridge between the user and virtual worlds. In VR eye tracking, fringe projection profilometry (FPP) avoids dependence on scene textures and provides accurate results in near-eye scenarios; however, phase-shifting-based FPP suffers from motion artifacts and may not meet the low-latency requirements of eye tracking. Fourier transform profilometry, on the other hand, achieves single-shot 3D sensing but is highly sensitive to texture variations on the eye. To address these challenges, researchers have explored deep learning-based single-shot fringe projection 3D sensing techniques. However, building a training dataset is expensive, and without abundant data the model generalizes poorly. In this paper, we build a virtual fringe projection system along with photorealistic face and eye models to synthesize large amounts of training data, reducing the cost of data collection and enhancing the generalization ability of the convolutional neural network (CNN). The training data synthesizer uses physically based rendering (PBR) and achieves high photorealism; we demonstrate that PBR can simulate the complex double refraction of structured light caused by the cornea. To train the CNN, we adopt transfer learning: the CNN is first trained on PBR-generated data and then fine-tuned on real data. Testing on real data shows that the synthesized data improves the model's performance, achieving gaze accuracy of about 3.722 degrees and pupil position error of 0.5363 mm on an unseen participant.
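The single-shot Fourier transform profilometry that the abstract contrasts with phase-shifting FPP can be sketched as below. This is a minimal illustration of the classic approach (row-wise FFT, band-pass around the carrier lobe, inverse FFT, phase demodulation); the function name, carrier frequency, and band-pass width are illustrative assumptions, not details from the paper.

```python
import numpy as np

def ftp_phase(fringe, carrier_freq, band=0.5):
    """Recover the wrapped phase of a single fringe image via
    Fourier transform profilometry: FFT each row, keep a band
    around the +carrier lobe, inverse FFT, then remove the carrier.

    fringe: 2D array, intensity a + b*cos(2*pi*carrier_freq*x + phi)
    carrier_freq: carrier frequency in cycles per pixel (illustrative)
    band: half-width of the pass band as a fraction of carrier_freq
    """
    F = np.fft.fft(fringe, axis=1)
    freqs = np.fft.fftfreq(fringe.shape[1])
    # Band-pass: keep only frequencies near the positive carrier lobe,
    # suppressing the DC term and the conjugate (negative) lobe.
    mask = np.abs(freqs - carrier_freq) < band * carrier_freq
    analytic = np.fft.ifft(F * mask, axis=1)
    # Demodulate the carrier so only the object-induced phase remains.
    x = np.arange(fringe.shape[1])
    return np.angle(analytic * np.exp(-2j * np.pi * carrier_freq * x))
```

Because only one frame is needed, this avoids the motion artifacts of phase shifting, but any eye texture that leaks spectral energy into the pass band corrupts the recovered phase, which is the weakness the paper's CNN-based approach targets.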
Pages: 11