An efficient saliency prediction model for Unmanned Aerial Vehicle video

Cited by: 2
Authors
Zhang, Kao [1 ]
Chen, Zhenzhong [1 ]
Li, Songnan [2 ]
Liu, Shan [2 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Tencent Media Lab, Shenzhen 518057, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Visual saliency; UAV video analysis; Spatial-temporal features; Prior information; VISUAL-ATTENTION; SPATIOTEMPORAL SALIENCY; OBJECT DETECTION; NEURAL-NETWORK; TARGETS;
DOI
10.1016/j.isprsjprs.2022.10.008
Chinese Library Classification (CLC)
P9 [Physical Geography];
Discipline code
0705; 070501;
Abstract
Visual saliency prediction plays an important role in Unmanned Aerial Vehicle (UAV) video analysis tasks. In this paper, an efficient saliency prediction model for UAV video is proposed based on spatial-temporal features, prior information, and inter-frame relationships; high efficiency is achieved through a simplified network design. Since UAV videos usually cover a wide range of scenes containing various background disturbances, a cascading architecture module is proposed for coarse-to-fine feature extraction, in which a saliency-related feature sub-network extracts useful clues from each frame and a new convolution block then captures spatial-temporal features. This structure achieves advanced performance at high speed within a 2D CNN framework. Moreover, a multi-stream prior module is proposed to model the bias phenomenon in viewing behavior for UAV video scenes; it automatically learns prior information from the video context and can also incorporate other priors. Finally, based on the spatial-temporal features and learned priors, a temporal weighted average module is proposed to model the inter-frame relationship and generate the final saliency map, making the generated saliency maps smoother in the temporal dimension. The proposed method is compared with 17 state-of-the-art models on two public UAV video saliency prediction datasets, and the experimental results demonstrate that it outperforms the other competitors. Source code is available at: https://github.com/zhangkao/IIP_UAVSal_Saliency.
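The abstract's temporal weighted average idea — blending each frame's saliency map with information from preceding frames so the sequence looks temporally smooth — can be illustrated with a minimal sketch. This is not the paper's module (which is learned end-to-end and also incorporates the learned priors); here a fixed hypothetical `decay` parameter stands in for the learned weighting, purely to show the smoothing effect:

```python
import numpy as np

def temporal_weighted_average(sal_maps, decay=0.5):
    """Causal weighted average over a sequence of per-frame saliency maps.

    Illustrative sketch only: the paper's module learns its inter-frame
    weighting, whereas here `decay` is a fixed, hypothetical blending
    factor between the running average and the current frame's map.
    """
    smoothed = []
    running = None
    for s in sal_maps:
        if running is None:
            running = s.astype(np.float64)  # first frame: nothing to blend with
        else:
            # keep `decay` of the history, take (1 - decay) of the new frame
            running = decay * running + (1.0 - decay) * s
        smoothed.append(running.copy())
    return smoothed
```

Larger `decay` values weight the history more heavily and yield smoother but less responsive saliency sequences; the paper instead learns this trade-off from data.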
Pages: 152-166
Number of pages: 15