Motion-Aware Memory Network for Fast Video Salient Object Detection

Cited by: 5
Authors
Zhao, Xing [1]
Liang, Haoran [1]
Li, Peipei [2]
Sun, Guodao [1]
Zhao, Dongdong [1]
Liang, Ronghua [1]
He, Xiaofei [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Mech Engn, Hangzhou 310023, Peoples R China
Keywords
Video salient object detection; salient object detection; memory network; feature fusion; optimization
DOI
10.1109/TIP.2023.3348659
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Previous methods based on 3D CNNs, ConvLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, these methods still suffer from high computational costs or poor quality of the generated saliency maps. To address these problems, we design a space-time memory (STM)-based network that employs a standard encoder-decoder architecture. During the encoding stage, we extract high-level temporal features from the current frame and its adjacent frames, which is more efficient and practical than methods reliant on optical flow. During the decoding stage, we introduce an effective fusion strategy for both spatial and temporal branches. The semantic information of the high-level features is used to improve the object details in the low-level features. Subsequently, spatiotemporal features are derived step by step to reconstruct the saliency maps. Moreover, inspired by the boundary supervision prevalent in image salient object detection (ISOD), we design a motion-aware loss that predicts object boundary motion, and simultaneously perform multitask learning for VSOD and object motion prediction. This further enhances the model's capability to accurately extract spatiotemporal features while maintaining object integrity. Extensive experiments on several datasets demonstrate the effectiveness of our method, which achieves state-of-the-art metrics on some datasets. Our proposed model does not require optical flow or additional preprocessing, and can reach an inference speed of nearly 100 FPS.
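As a rough illustration of the space-time memory idea mentioned in the abstract, the sketch below shows a generic attention-style memory read: each location of the current frame compares its key against keys from adjacent frames and retrieves a weighted sum of their values. This is the general memory-read pattern, not the authors' implementation; the function names (`softmax`, `dot`, `memory_read`), the flattened list layout, and the toy data are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def memory_read(query_keys, memory_keys, memory_values):
    """Hypothetical memory read (illustrative, not the paper's code).

    query_keys:    key vectors of the current frame, one per location
    memory_keys:   key vectors of adjacent frames, one per memory location
    memory_values: value vectors aligned with memory_keys
    Returns one retrieved value vector per query location.
    """
    dim = len(memory_values[0])
    read = []
    for q in query_keys:
        # Similarity of this query location to every memory location.
        weights = softmax([dot(q, k) for k in memory_keys])
        # Weighted sum of memory values acts as the temporal feature.
        read.append([
            sum(w * v[d] for w, v in zip(weights, memory_values))
            for d in range(dim)
        ])
    return read


# Toy data: two query locations, two memory locations, 3-dim values.
query_keys = [[1.0, 0.0], [0.0, 1.0]]
memory_keys = [[10.0, 0.0], [0.0, 10.0]]
memory_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
temporal = memory_read(query_keys, memory_keys, memory_values)
```

In this toy setup each query location attends almost entirely to the memory location whose key matches it, so the retrieved vector is close to that location's value. A real network would compute keys and values with learned convolutions and fuse the result with spatial features in the decoder.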
Pages: 709-721
Number of pages: 13
Related papers
50 in total
  • [41] Transformers and CNNs fusion network for salient object detection
    Yao, Cuili
    Feng, Lin
    Kong, Yuqiu
    Xiao, Lin
    Chen, Tao
    NEUROCOMPUTING, 2023, 520 : 342 - 355
  • [42] FANet: focus-aware lightweight light field salient object detection network
    Fu, Jiamin
    Chen, Zhihong
    Zhang, Haiwei
    Gao, Yuxuan
    Xu, Haitao
    Zhang, Hao
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2025, 22 (01)
  • [43] Interactive context-aware network for RGB-T salient object detection
    Wang, Yuxuan
    Dong, Feng
    Zhu, Jinchao
    Chen, Jianren
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (28) : 72153 - 72174
  • [44] Global and local information aggregation network for edge-aware salient object detection
    Zhang, Qing
    Zhang, Liqian
    Wang, Dong
    Shi, Yanjiao
    Lin, Jiajun
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 81
  • [45] BPFINet: Boundary-aware progressive feature integration network for salient object detection
    Chen, Tianyou
    Hu, Xiaoguang
    Xiao, Jin
    Zhang, Guofeng
    NEUROCOMPUTING, 2021, 451 : 152 - 166
  • [46] Fast Video Saliency Detection via Maximally Stable Region Motion and Object Repeatability
    Huang, Xiaoming
    Zhang, Yu-Jin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 4458 - 4470
  • [47] CANet: Context-aware Aggregation Network for Salient Object Detection of Surface Defects
    Wan, Bin
    Zhou, Xiaofei
    Zhu, Bin
    Xiao, Mang
    Sun, Yaoqi
    Zheng, Bolun
    Zhang, Jiyong
    Yan, Chenggang
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 93
  • [48] Selective feature fusion network for salient object detection
    Sun, Fengming
    Yuan, Xia
    Zhao, Chunxia
    IET COMPUTER VISION, 2023, 17 (04) : 483 - 495
  • [49] Fast Salient Object Detection Based on Segments
    Zhuang, Liansheng
    Tang, Ketan
    Yu, Nenghai
    Qian, Yangchun
    2009 INTERNATIONAL CONFERENCE ON MEASURING TECHNOLOGY AND MECHATRONICS AUTOMATION, VOL I, 2009, : 469 - 472
  • [50] Global-aware Interaction Network for RGB-D salient object detection
    Jiang, Zijian
    Yu, Ling
    Han, Yu
    Li, Junru
    Niu, Fanglin
    NEUROCOMPUTING, 2025, 621