Full-Duplex Strategy for Video Object Segmentation

Cited by: 111
Authors
Ji, Ge-Peng [1, 2]
Fu, Keren [3 ]
Wu, Zhe [4 ]
Fan, Deng-Ping [1 ]
Shen, Jianbing [5 ]
Shao, Ling [1 ]
Affiliations
[1] IIAI, Hong Kong, Peoples R China
[2] Wuhan Univ, Sch CS, Wuhan, Peoples R China
[3] Sichuan Univ, Coll CS, Chengdu, Peoples R China
[4] Peng Cheng Lab, Chengdu, Peoples R China
[5] Univ Macau, Dept CIS, Zhuhai, Peoples R China
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Funding
China Postdoctoral Science Foundation
Keywords
SALIENCY DETECTION; OPTIMIZATION
DOI
10.1109/ICCV48922.2021.00488
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Appearance and motion are two important sources of information in video object segmentation (VOS). Previous methods mainly rely on simplex solutions, which lowers the upper bound of feature collaboration among and across these two cues. In this paper, we study a novel framework, termed FSNet (Full-duplex Strategy Network), which designs a relational cross-attention module (RCAM) to achieve bidirectional message propagation across embedding subspaces. Furthermore, a bidirectional purification module (BPM) is introduced to update the inconsistent features between the spatial-temporal embeddings, effectively improving model robustness. By considering the mutual restraint within the full-duplex strategy, FSNet performs cross-modal feature passing (i.e., transmission and receiving) simultaneously before the fusion and decoding stage, making it robust to various challenging scenarios (e.g., motion blur, occlusion) in VOS. Extensive experiments on five popular benchmarks (i.e., DAVIS16, FBMS, MCL, SegTrack-V2, and DAVSOD19) show that FSNet outperforms other state-of-the-art methods for both the VOS and video salient object detection tasks.
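The contrast the abstract draws between simplex and full-duplex feature passing can be illustrated with a toy sketch. This is not the paper's RCAM or BPM: the sigmoid gate, the residual update, and the function names `gate`, `simplex_step`, and `full_duplex_step` are all assumptions made for illustration. The only point shown is that in the full-duplex case both cues are updated from the same inputs at once, whereas in the simplex case only one cue receives guidance.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(source, target):
    """Reweight `target` element-wise by a sigmoid gate computed from `source`.

    Hypothetical stand-in for cross-modal attention; a residual term keeps
    the original target signal.
    """
    return [t + t * sigmoid(s) for s, t in zip(source, target)]


def simplex_step(appearance, motion):
    # One direction only: motion guides appearance, motion is left unchanged.
    return gate(motion, appearance), list(motion)


def full_duplex_step(appearance, motion):
    # Both directions, each computed from the same un-updated inputs,
    # so transmission and receiving happen simultaneously.
    new_appearance = gate(motion, appearance)
    new_motion = gate(appearance, motion)
    return new_appearance, new_motion


# Toy feature vectors (made-up values).
appearance = [0.5, -1.0, 2.0]
motion = [0.0, 3.0, -2.0]

a_simplex, m_simplex = simplex_step(appearance, motion)
a_duplex, m_duplex = full_duplex_step(appearance, motion)
```

In this sketch the appearance update is identical in both modes; the difference is that only the full-duplex step also refines the motion features using appearance.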
Pages: 4902-4913
Number of pages: 12