Self-Supervised Progressive Network for High Performance Video Object Segmentation

Cited by: 4
Authors
Li, Guorong [1 ]
Hong, Dexiang [1 ]
Xu, Kai [1 ]
Zhong, Bineng [2 ]
Su, Li [1 ]
Han, Zhenjun [3 ]
Huang, Qingming [1 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China
[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[3] Univ Chinese Acad Sci UCAS, Sch Elect Elect & Commun Engn, Beijing 101408, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Customer relationship management; Semantics; Object segmentation; Collaboration; Visualization; Decoding; Cycle consistency; self-supervised; similarity learning; video object segmentation (VOS); TRACKING;
DOI
10.1109/TNNLS.2022.3219936
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, self-supervised video object segmentation (VOS) has attracted much interest. However, most proxy tasks are proposed to train only a single backbone, which relies on a point-to-point correspondence strategy to propagate masks through a video sequence. Because of this simple pipeline, the performance of the single-backbone paradigm is still unsatisfactory. Departing from this paradigm, we propose a self-supervised progressive network (SSPNet), which consists of a memory retrieval module (MRM) and a collaborative refinement module (CRM). The MRM performs point-to-point correspondence and produces a propagated coarse mask for a query frame through self-supervised pixel-level and frame-level similarity learning. The CRM, which is trained via cycle-consistency region tracking, aggregates the reference and query information and implicitly learns the collaborative relationship between them to refine the coarse mask. Furthermore, to learn semantic knowledge from unlabeled data, we design two novel mask-generation strategies that provide the CRM with training data carrying meaningful semantic information. Extensive experiments conducted on DAVIS-17, YouTube-VOS, and SegTrack v2 demonstrate that our method surpasses the state-of-the-art self-supervised methods and narrows the gap with fully supervised methods.
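The point-to-point correspondence strategy mentioned in the abstract can be illustrated with a generic affinity-based label propagation step. The sketch below is not the SSPNet MRM itself: the function name `propagate_mask`, the `temperature` parameter, and the toy features are all hypothetical, chosen only to show how a softmax over pixel similarities can carry a reference mask to a query frame.

```python
import math

def propagate_mask(ref_feats, ref_mask, query_feats, temperature=0.1):
    """Generic affinity-based propagation (illustrative, not the paper's code).

    ref_feats:   list of N feature vectors, one per reference pixel
    ref_mask:    list of N label distributions, e.g. [background, foreground]
    query_feats: list of M feature vectors, one per query pixel
    Returns a list of M label distributions for the query pixels.
    """
    num_labels = len(ref_mask[0])
    propagated = []
    for q in query_feats:
        # Dot-product similarity between this query pixel and every reference pixel.
        sims = [sum(a * b for a, b in zip(q, r)) / temperature for r in ref_feats]
        # Numerically stable softmax over the reference pixels.
        peak = max(sims)
        exps = [math.exp(s - peak) for s in sims]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Affinity-weighted average of the reference labels.
        propagated.append([
            sum(w * label[k] for w, label in zip(weights, ref_mask))
            for k in range(num_labels)
        ])
    return propagated

# Toy example (assumed data): two reference pixels, two query pixels.
ref_feats = [[1.0, 0.0], [0.0, 1.0]]
ref_mask = [[0.0, 1.0], [1.0, 0.0]]   # first pixel is foreground, second background
query_feats = [[0.9, 0.1], [0.1, 0.9]]

coarse = propagate_mask(ref_feats, ref_mask, query_feats)
labels = [max(range(2), key=lambda k: p[k]) for p in coarse]
print(labels)
```

In this toy setup, each query pixel takes the label of the reference pixel it most resembles. A mask produced this way is coarse, which is the limitation the abstract says the CRM is designed to address.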
Pages: 7671-7684
Page count: 14
Related papers
50 in total
  • [21] Scribble-Supervised Video Object Segmentation via Scribble Enhancement
    Gao, Xingyu
    Li, Zuolei
    Shi, Hailong
    Chen, Zhenyu
    Zhao, Peilin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (04) : 2999 - 3012
  • [22] Online self-supervised learning for dynamic object segmentation
    Guizilini, Vitor
    Ramos, Fabio
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2015, 34 (4-5) : 559 - 581
  • [23] Self-Supervised Video-Based Action Recognition With Disturbances
    Lin, Wei
    Ding, Xinghao
    Huang, Yue
    Zeng, Huanqiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 2493 - 2507
  • [24] Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection
    Han, Mingfei
    Wang, Yali
    Li, Mingjie
    Chang, Xiaojun
    Yang, Yi
    Qiao, Yu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1560 - 1573
  • [25] Research on Video Object Segmentation Algorithm
    Bo, Guan
    PROCEEDINGS OF THE 2015 4TH NATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND COMPUTER ENGINEERING (NCEECE 2015), 2016, 47 : 1399 - 1402
  • [26] Camouflaged Object Segmentation Based on Matching-Recognition-Refinement Network
    Yan, Xinyu
    Sun, Meijun
    Han, Yahong
    Wang, Zheng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 15993 - 16007
  • [27] Temporo-Spatial Parallel Sparse Memory Networks for Efficient Video Object Segmentation
    Dang, Jisheng
    Zheng, Huicheng
    Wang, Bimei
    Wang, Longguang
    Guo, Yulan
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 17291 - 17304
  • [28] Self-Occlusion and Disocclusion in Causal Video Object Segmentation
    Yang, Yanchao
    Sundaramoorthi, Ganesh
    Soatto, Stefano
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015 : 4408 - 4416
  • [29] Neural network approach to background Modeling for video object segmentation
    Culibrk, Dubravko
    Marques, Oge
    Socek, Daniel
    Kalva, Hari
    Furht, Borko
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2007, 18 (06) : 1614 - 1627
  • [30] Region Aware Video Object Segmentation With Deep Motion Modeling
    Miao, Bo
    Bennamoun, Mohammed
    Gao, Yongsheng
    Mian, Ajmal
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 2639 - 2651