Human gaze prediction for 3D light field display based on multi-attention fusion network

Cited by: 0

Authors:

Zhao, Meng [1]

Yan, Binbin [1]

Chen, Shuo [1]

Guo, Xiao [1]

Li, Ningchi [1]

Chen, Duo [1]

Wang, Kuiru [1]

Sang, Xinzhu [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, POB 72, Beijing 100876, Peoples R China

Source:

OPTICS COMMUNICATIONS | 2024 / Vol. 560

Funding:

National Natural Science Foundation of China;

Keywords:

Human eye fixation; Convolutional neural network; Multi-view image; 3D light field display; SALIENCY; MODEL;

DOI:

10.1016/j.optcom.2024.130458

Chinese Library Classification (CLC) number:

O43 [Optics];

Subject classification codes:

070207 ; 0803 ;

Abstract:

Existing methods for simulating human visual attention focus primarily on 2D displays, and little research has addressed predicting visual attention in three-dimensional (3D) light field content. 3D light field displays give viewers a heightened sense of stereoscopic realism. To make the content of 3D light field displays more consistent with human visual characteristics, we propose a novel method for predicting human eye fixation in 3D light field display images. First, we collected real eye movement data and used it to create an eye movement dataset based on 3D light field display images. This addresses the lack of datasets for human gaze on 3D light field images. Then, we propose a convolutional neural network model with multiple inputs and outputs that integrates attention modules. This model was trained and used to predict eye fixation on the constructed eye movement dataset. A correlation exists between the predicted human gaze for multiple distinct views of the same light field image. Finally, we predicted the human gaze area of light field multi-view images with our model. Experimental results demonstrate that our model accurately predicts human gaze regions across different views of a 3D light field image. The human gaze predicted by the model for each view is largely consistent and relatively accurate. With the proposed method, we can effectively anticipate where viewers will focus their attention on the 3D light field display, which is beneficial for targeted improvement of 3D light field display content.


Pages: 8
