Motion-Aware Memory Network for Fast Video Salient Object Detection

Cited by: 5
Authors
Zhao, Xing [1]
Liang, Haoran [1]
Li, Peipei [2]
Sun, Guodao [1]
Zhao, Dongdong [1]
Liang, Ronghua [1]
He, Xiaofei [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Mech Engn, Hangzhou 310023, Peoples R China
Keywords
Video salient object detection; salient object detection; memory network; feature fusion; OPTIMIZATION
DOI
10.1109/TIP.2023.3348659
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous methods based on 3D CNNs, ConvLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, these methods still suffer from high computational cost or from low-quality saliency maps. To address this, we design a space-time memory (STM)-based network that employs a standard encoder-decoder architecture. During the encoding stage, we extract high-level temporal features from the current frame and its adjacent frames, which is more efficient and practical than relying on optical flow. During the decoding stage, we introduce an effective fusion strategy for the spatial and temporal branches: the semantic information of the high-level features is used to improve the object details in the low-level features, and spatiotemporal features are then derived step by step to reconstruct the saliency maps. Moreover, inspired by the boundary supervision prevalent in image salient object detection (ISOD), we design a motion-aware loss that predicts object boundary motion, and we simultaneously perform multitask learning for VSOD and object motion prediction. This further enhances the model's ability to extract spatiotemporal features accurately while maintaining object integrity. Extensive experiments on several datasets demonstrate the effectiveness of our method, which achieves state-of-the-art metrics on some of them. The proposed model requires neither optical flow nor additional preprocessing and reaches an inference speed of nearly 100 FPS.
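The abstract names a space-time memory (STM) read but does not give its details. As a rough illustration only, the sketch below shows the generic soft-attention memory read commonly associated with STM-style networks: each query location compares its key against all memory keys, and the softmax-normalized similarities weight the memory values. The function name `memory_read`, the dot-product similarity, and the toy feature sizes are assumptions made for this example; this is not the authors' implementation.

```python
import math


def memory_read(query_keys, memory_keys, memory_values):
    """Hypothetical STM-style read (illustrative assumption, not the paper's code).

    query_keys:    list of key vectors, one per query location (current frame)
    memory_keys:   list of key vectors, one per memory location (adjacent frames)
    memory_values: list of value vectors, aligned with memory_keys
    Returns one read-out vector per query location.
    """
    value_dim = len(memory_values[0])
    read_out = []
    for q in query_keys:
        # Dot-product similarity between this query key and every memory key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in memory_keys]
        # Numerically stable softmax over memory locations.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the memory values.
        read_out.append(
            [sum(w * v[d] for w, v in zip(weights, memory_values))
             for d in range(value_dim)]
        )
    return read_out


# Toy usage: one query location, two memory locations.
query = [[1.0, 0.0]]
mem_keys = [[10.0, 0.0], [0.0, 10.0]]
mem_values = [[1.0], [0.0]]
result = memory_read(query, mem_keys, mem_values)
```

In this toy case the query key matches the first memory key far more strongly than the second, so the read-out is dominated by the first memory value. In a real network the keys and values would be convolutional feature maps flattened over space and time rather than hand-written lists.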
Pages: 709-721
Page count: 13