Motion-Aware Memory Network for Fast Video Salient Object Detection

Cited by: 5
Authors
Zhao, Xing [1]
Liang, Haoran [1]
Li, Peipei [2]
Sun, Guodao [1]
Zhao, Dongdong [1]
Liang, Ronghua [1]
He, Xiaofei [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Mech Engn, Hangzhou 310023, Peoples R China
Keywords
Video salient object detection; salient object detection; memory network; feature fusion; optimization
DOI
10.1109/TIP.2023.3348659
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous methods based on 3DCNN, convLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, these methods still suffer from high computational costs or poor quality of the generated saliency maps. To address this, we design a space-time memory (STM)-based network that employs a standard encoder-decoder architecture. During the encoding stage, we extract high-level temporal features from the current frame and its adjacent frames, which is more efficient and practical than methods reliant on optical flow. During the decoding stage, we introduce an effective fusion strategy for both spatial and temporal branches. The semantic information of the high-level features is used to improve the object details in the low-level features. Subsequently, spatiotemporal features are derived step by step to reconstruct the saliency maps. Moreover, inspired by the boundary supervision prevalent in image salient object detection (ISOD), we design a motion-aware loss that predicts object boundary motion, and simultaneously perform multitask learning for VSOD and object motion prediction. This multitask learning can further enhance the model's capability to accurately extract spatiotemporal features while maintaining object integrity. Extensive experiments on several datasets demonstrate the effectiveness of our method, which achieves state-of-the-art metrics on some datasets. Our proposed model does not require optical flow or additional preprocessing, and can reach an inference speed of nearly 100 FPS.
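The abstract does not spell out how the network reads temporal features from adjacent frames. As an illustration only, here is a minimal NumPy sketch of the generic key-value memory read used in space-time memory networks in general. It is not taken from the paper: the function name `memory_read`, the tensor shapes, and the plain dot-product softmax attention are all assumptions.

```python
import numpy as np

def memory_read(query_key, memory_key, memory_value):
    """Generic space-time memory read (key-value attention); an assumed
    formulation, not the paper's confirmed design.

    query_key:    (N_q, C_k) keys for the N_q positions of the current frame
    memory_key:   (N_m, C_k) keys for all positions of the adjacent frames
    memory_value: (N_m, C_v) values stored at those memory positions
    returns:      (N_q, C_v) temporal features for the current frame
    """
    # Similarity between every current-frame position and every memory position.
    scores = query_key @ memory_key.T                # (N_q, N_m)
    # Softmax over memory positions, shifted for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each current-frame position receives a weighted sum of memory values.
    return weights @ memory_value                    # (N_q, C_v)

# Hypothetical sizes: a 4x4 feature map, 2 adjacent frames held in memory.
rng = np.random.default_rng(0)
h, w, t = 4, 4, 2
q_k = rng.standard_normal((h * w, 8))
m_k = rng.standard_normal((t * h * w, 8))
m_v = rng.standard_normal((t * h * w, 16))
temporal = memory_read(q_k, m_k, m_v)
print(temporal.shape)
```

Because the read is a single matrix product over high-level (low-resolution) features, it needs no separately computed optical flow, which is consistent with the efficiency argument made in the abstract.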
Pages: 709-721
Number of pages: 13
Related papers
50 in total
  • [31] A semi-supervised recurrent neural network for video salient object detection
    Aditya Kompella
    Raghavendra V. Kulkarni
    Neural Computing and Applications, 2021, 33 : 2065 - 2083
  • [32] Attention Embedded Spatio-Temporal Network for Video Salient Object Detection
    Huang, Lili
    Yan, Pengxiang
    Li, Guanbin
    Wang, Qing
    Lin, Liang
    IEEE ACCESS, 2019, 7 : 166203 - 166213
  • [33] Video Salient Object Detection via Fully Convolutional Networks
    Wang, Wenguan
    Shen, Jianbing
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (01) : 38 - 49
  • [34] Transformer-based Cross Reference Network for video salient object detection
    Huang, Kan
    Tian, Chunwei
    Su, Jingyong
    Lin, Jerry Chun-Wei
    PATTERN RECOGNITION LETTERS, 2022, 160 : 122 - 127
  • [35] CASNet: A Cross-Attention Siamese Network for Video Salient Object Detection
    Ji, Yuzhu
    Zhang, Haijun
    Jie, Zequn
    Ma, Lin
    Wu, Q. M. Jonathan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (06) : 2676 - 2690
  • [36] Video Salient Object Detection Using Spatiotemporal Deep Features
    Trung-Nghia Le
    Sugimoto, Akihiro
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (10) : 5002 - 5015
  • [37] Dynamic Context-Sensitive Filtering Network for Video Salient Object Detection
    Zhang, Miao
    Liu, Jie
    Wang, Yifei
    Piao, Yongri
    Yao, Shunyu
    Ji, Wei
    Li, Jingjing
    Lu, Huchuan
    Luo, Zhongxuan
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1533 - 1543
  • [38] A semi-supervised recurrent neural network for video salient object detection
    Kompella, Aditya
    Kulkarni, Raghavendra, V
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (06) : 2065 - 2083
  • [39] A novel spatiotemporal attention enhanced discriminative network for video salient object detection
    Bing Liu
    Kezhou Mu
    Mingzhu Xu
    Fangyuan Wang
    Lei Feng
    Applied Intelligence, 2022, 52 : 5922 - 5937
  • [40] BENet: Boundary Enhance Network for Salient Object Detection
    Yan, Zhiqi
    Liang, Shuang
    MULTIMEDIA MODELING, MMM 2023, PT II, 2023, 13834 : 228 - 239