Saliency prediction on omnidirectional images with attention-aware feature fusion network

Cited by: 1
Authors
Dandan Zhu
Yongqing Chen
Defang Zhao
Qiangqiang Zhou
Xiaokang Yang
Affiliations
[1] Shanghai Jiao Tong University,Artificial Intelligence Institute
[2] Hainan Air Traffic Management Sub-Bureau,School of Software Engineering
[3] Tongji University,School of Information and Computer
[4] Shanghai Business School
Source
Applied Intelligence | 2021, Vol. 51
Keywords
Attention-aware features; Visibility score; Omnidirectional image; Saliency prediction
Abstract
Recent years have witnessed the rapid development of deep learning and its successful application to saliency prediction on traditional 2D images. However, applying deep neural network (DNN) models to saliency prediction on omnidirectional images (ODIs) raises two critical issues: (1) existing ODI datasets are too small to support training DNN-based models, and (2) saliency prediction is difficult because many ODIs contain complex background clutter. To address these two problems, we propose a novel Attention-Aware Feature Fusion Network (AAFFN) that is first trained on traditional 2D images and then transferred to ODIs for saliency prediction. Specifically, the proposed AAFFN consists of three modules: a Part-guided Attention (PA) module, a Visibility Score (VS) module, and an Attention-Aware Feature Fusion (AAFF) module. The PA module extracts precise features to estimate attention over fine-grained parts of ODIs and suppresses the influence of cluttered backgrounds. Meanwhile, the VS module measures the proportion of the foreground and background parts and generates visibility scores during feature learning. Finally, the AAFF module performs a weighted fusion of the attention maps and visibility scores to produce the final saliency map. Extensive experiments and ablation analysis demonstrate that the proposed model achieves superior performance and outperforms other state-of-the-art methods on public benchmark datasets.
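To make the fusion step concrete, below is a minimal PyTorch sketch of the attention-aware fusion the abstract describes: part-level attention maps are weighted by per-part visibility scores and summed into a single saliency map. All names, shapes, and layer choices (the `AttentionAwareFusion` class, `num_parts`, the 1x1-convolution attention head, the pooled linear score head) are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged sketch of AAFF-style fusion; layer choices are assumptions,
# since the abstract does not specify the exact architecture.
import torch
import torch.nn as nn

class AttentionAwareFusion(nn.Module):
    def __init__(self, in_channels: int = 256, num_parts: int = 4):
        super().__init__()
        # Part-guided attention head: one attention map per part
        # (assumed form of the PA module's output).
        self.part_attn = nn.Conv2d(in_channels, num_parts, kernel_size=1)
        # Visibility-score head: one scalar per part, intended to reflect
        # the foreground/background proportion (assumed form of the VS module).
        self.vis_score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_channels, num_parts),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) backbone features from an equirectangular ODI.
        attn = torch.sigmoid(self.part_attn(feats))           # (B, K, H, W) part attention maps
        scores = torch.softmax(self.vis_score(feats), dim=1)  # (B, K) visibility weights
        # AAFF step: visibility-weighted sum of part attention maps -> saliency map.
        saliency = (attn * scores[:, :, None, None]).sum(dim=1, keepdim=True)
        return saliency                                       # (B, 1, H, W)

feats = torch.randn(2, 256, 32, 64)    # dummy features for a 2-image batch
sal = AttentionAwareFusion()(feats)
print(sal.shape)                       # torch.Size([2, 1, 32, 64])
```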
Pages: 5344-5357 (13 pages)