Motion-Aware Memory Network for Fast Video Salient Object Detection

Cited by: 5

Authors:

Zhao, Xing [1]

Liang, Haoran [1]

Li, Peipei [2]

Sun, Guodao [1]

Zhao, Dongdong [1]

Liang, Ronghua [1]

He, Xiaofei [1]

Affiliations:

[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China

[2] Zhejiang Univ Technol, Coll Mech Engn, Hangzhou 310023, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2024 / Vol. 33

Keywords:

Video salient object detection; salient object detection; memory network; feature fusion; OPTIMIZATION;

DOI:

10.1109/TIP.2023.3348659

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Previous methods based on 3D CNNs, ConvLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, these methods still suffer from high computational costs or poor-quality saliency maps. To address this, we design a space-time memory (STM)-based network that employs a standard encoder-decoder architecture. During the encoding stage, we extract high-level temporal features from the current frame and its adjacent frames, which is more efficient and practical than relying on optical flow. During the decoding stage, we introduce an effective fusion strategy for both the spatial and temporal branches. The semantic information of the high-level features is used to improve the object details in the low-level features. Spatiotemporal features are then derived step by step to reconstruct the saliency maps. Moreover, inspired by the boundary supervision prevalent in image salient object detection (ISOD), we design a motion-aware loss that predicts object boundary motion, and we simultaneously perform multitask learning for VSOD and object motion prediction. This further enhances the model's ability to extract spatiotemporal features accurately while maintaining object integrity. Extensive experiments on several datasets demonstrate the effectiveness of our method, which achieves state-of-the-art metrics on some of them. Our proposed model does not require optical flow or additional preprocessing, and it reaches an inference speed of nearly 100 FPS.
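The abstract does not spell out how the space-time memory is read, so the following is only a minimal sketch of the generic memory-read pattern used by STM-style networks, not the authors' implementation. It assumes key and value features have already been extracted and flattened over space (and over frames for the memory); the function name `memory_read` and all shapes are hypothetical.

```python
import numpy as np


def memory_read(query_key, memory_key, memory_value):
    """Generic STM-style memory read (illustrative assumption, not the paper's code).

    query_key:    (C, N) key features of the current frame, N = H*W locations
    memory_key:   (C, M) key features of the adjacent frames, M = T*H*W locations
    memory_value: (D, M) value features of the adjacent frames
    Returns the read-out features (D, N) and the attention weights (M, N).
    """
    # Dot-product similarity between every memory location and every query location.
    scores = memory_key.T @ query_key                      # (M, N)
    # Softmax over the memory axis, shifted by the column max for numerical stability.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Each query location receives a weighted average of the memory values.
    return memory_value @ weights, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    current = rng.standard_normal((8, 6))      # hypothetical: C=8, N=6
    mem_keys = rng.standard_normal((8, 12))    # hypothetical: two frames of 6 locations
    mem_vals = rng.standard_normal((4, 12))    # hypothetical: D=4
    read, attn = memory_read(current, mem_keys, mem_vals)
    print(read.shape, attn.shape)
```

In a full network the read-out would typically be combined with the current frame's own features before decoding; how this paper fuses them is not described in the abstract.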


Pages: 709-721

Number of pages: 13

50 items in total

[1] Motion-Aware Rapid Video Saliency Detection
Guo, Fang
Wang, Wenguan
Shen, Ziyi
Shen, Jianbing
Shao, Ling
Tao, Dacheng
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (12) : 4887 - 4898
[2] Spatiotemporal context-aware network for video salient object detection
Chen, Tianyou
Xiao, Jin
Hu, Xiaoguang
Zhang, Guofeng
Wang, Shaojie
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (19) : 16861 - 16877
[3] Spatiotemporal context-aware network for video salient object detection
Tianyou Chen
Jin Xiao
Xiaoguang Hu
Guofeng Zhang
Shaojie Wang
Neural Computing and Applications, 2022, 34 : 16861 - 16877
[4] IENet: inheritance enhancement network for video salient object detection
Jiang, Tao
Wang, Yi
Hou, Feng
Wang, Ruili
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (28) : 72007 - 72026
[5] Motion Context guided Edge-preserving network for video salient object detection
Huang, Kan
Tian, Chunwei
Xu, Zhijing
Li, Nannan
Lin, Jerry Chun-Wei
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 233
[6] GUIDANCE AND TEACHING NETWORK FOR VIDEO SALIENT OBJECT DETECTION
Jiao, Yingxia
Wang, Xiao
Chou, Yu-Cheng
Yang, Shouyuan
Ji, Ge-Peng
Zhu, Rong
Gao, Ge
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2199 - 2203
[7] Part-aware attention correctness for video salient object detection
Liu, Ze-yu
Liu, Jian-wei
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 119
[8] Motion-Aware Temporal Coherence for Video Resizing
Wang, Yu-Shuen
Fu, Hongbo
Sorkine, Olga
Lee, Tong-Yee
Seidel, Hans-Peter
ACM TRANSACTIONS ON GRAPHICS, 2009, 28 (05): : 1 - 10
[9] Pyramid Constrained Self-Attention Network for Fast Video Salient Object Detection
Gu, Yuchao
Wang, Lijuan
Wang, Ziqin
Liu, Yun
Cheng, Ming-Ming
Lu, Shao-Ping
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 10869 - 10876
[10] Optical Flow Guided Pyramid Network for Video Salient Object Detection
Tang, Tinglong
Hua, Sheng
Sun, Shuifa
Wu, Yirong
Zhu, Yuqi
Yue, Chonghao
PROCEEDINGS OF THE 2024 27 TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 723 - 728
