Global Spectral Filter Memory Network for Video Object Segmentation

Cited by: 29

Authors:

Liu, Yong ^{[1,2]}

Yu, Ran ^{[1]}

Wang, Jiahao ^{[1]}

Zhao, Xinyuan ^{[3]}

Wang, Yitong ^{[2]}

Tang, Yansong ^{[1]}

Yang, Yujiu ^{[1]}

Affiliations:

[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China

[2] ByteDance Inc, Beijing, Peoples R China

[3] Northwestern Univ, Evanston, IL USA

Source:

COMPUTER VISION, ECCV 2022, PT XXIX | 2022 / Vol. 13689

Funding:

National Natural Science Foundation of China;

Keywords:

Video object segmentation; Spectral domain;

DOI:

10.1007/978-3-031-19818-2_37

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper studies semi-supervised video object segmentation through boosting intra-frame interaction. Recent memory network-based methods focus on exploiting inter-frame temporal reference while paying little attention to intra-frame spatial dependency. Specifically, these segmentation models tend to be susceptible to interference from unrelated non-target objects in a given frame. To this end, we propose the Global Spectral Filter Memory network (GSFM), which improves intra-frame interaction by learning long-term spatial dependencies in the spectral domain. The key component of GSFM is the 2D (inverse) discrete Fourier transform for spatial information mixing. Besides, we empirically find that low-frequency features should be enhanced in the encoder (backbone), while high-frequency features should be enhanced in the decoder (segmentation head). We attribute this to the encoder's role of extracting semantic information and the decoder's role of highlighting fine-grained details. Thus, a Low (High) Frequency Module is proposed to fit this circumstance. Extensive experiments on the popular DAVIS and YouTube-VOS benchmarks demonstrate that GSFM noticeably outperforms the baseline method and achieves state-of-the-art performance. Besides, extensive analysis shows that the proposed modules are reasonable and generalize well.
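The spectral-domain mixing described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the paper's modules are learned, whereas this sketch uses fixed radial masks, and the function names, `cutoff`, and `boost` are hypothetical choices made only for illustration. It shows the general pattern of a 2D DFT, a per-frequency gain, and an inverse 2D DFT, with one variant boosting low frequencies and one boosting high frequencies.

```python
import numpy as np


def radial_frequency(h, w):
    """Normalized radial frequency of each 2D DFT bin (0 at the DC bin)."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2)


def spectral_filter(feature, gain):
    """Global spatial mixing: 2D DFT -> per-frequency gain -> inverse 2D DFT.

    feature: real array of shape (..., H, W); gain: real array of shape (H, W).
    Every output position depends on every input position, because each
    frequency bin aggregates the whole spatial map.
    """
    spectrum = np.fft.fft2(feature, axes=(-2, -1))
    return np.fft.ifft2(spectrum * gain, axes=(-2, -1)).real


def low_frequency_enhance(feature, cutoff=0.1, boost=2.0):
    """Hypothetical fixed mask: amplify bins at or below `cutoff`."""
    h, w = feature.shape[-2:]
    gain = np.where(radial_frequency(h, w) <= cutoff, boost, 1.0)
    return spectral_filter(feature, gain)


def high_frequency_enhance(feature, cutoff=0.1, boost=2.0):
    """Hypothetical fixed mask: amplify bins above `cutoff`."""
    h, w = feature.shape[-2:]
    gain = np.where(radial_frequency(h, w) > cutoff, boost, 1.0)
    return spectral_filter(feature, gain)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 16, 16))  # (channels, H, W)
    print(low_frequency_enhance(feat).shape)
    print(high_frequency_enhance(feat).shape)
```

In a learned variant, the fixed `gain` array would be replaced by trainable parameters; the masks here only make the low-frequency versus high-frequency distinction concrete.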

Pages: 648-665

Page count: 18
