Self Supervised Progressive Network for High Performance Video Object Segmentation

Cited by: 4
Authors
Li, Guorong [1]
Hong, Dexiang [1]
Xu, Kai [1]
Zhong, Bineng [2]
Su, Li [1]
Han, Zhenjun [3]
Huang, Qingming [1]
Affiliations
[1] Univ Chinese Academy Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China
[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[3] Univ Chinese Acad Sci UCAS, Sch Elect Elect & Commun Engn, Beijing 101408, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Customer relationship management; Semantics; Object segmentation; Collaboration; Visualization; Decoding; Cycle consistency; self-supervised; similarity learning; video object segmentation (VOS); TRACKING
DOI
10.1109/TNNLS.2022.3219936
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, self-supervised video object segmentation (VOS) has attracted much interest. However, most proxy tasks are designed to train only a single backbone, which relies on a point-to-point correspondence strategy to propagate masks through a video sequence. Because of this simple pipeline, the performance of the single-backbone paradigm remains unsatisfactory. Instead, we propose a self-supervised progressive network (SSPNet), which consists of a memory retrieval module (MRM) and a collaborative refinement module (CRM). The MRM performs point-to-point correspondence and produces a propagated coarse mask for a query frame through self-supervised pixel-level and frame-level similarity learning. The CRM, which is trained via cycle-consistency region tracking, aggregates the reference and query information and implicitly learns the collaborative relationship between them to refine the coarse mask. Furthermore, to learn semantic knowledge from unlabeled data, we design two novel mask-generation strategies that provide the CRM with training data carrying meaningful semantic information. Extensive experiments on DAVIS-17, YouTube-VOS, and SegTrack v2 demonstrate that our method surpasses state-of-the-art self-supervised methods and narrows the gap with fully supervised methods.
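The point-to-point correspondence strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' MRM implementation: it assumes dot-product similarity between per-pixel features, a softmax over reference pixels with a hypothetical temperature of 0.07, and toy two-dimensional features. The name propagate_mask and all of its parameters are illustrative only.

```python
import math


def propagate_mask(ref_feats, ref_mask, qry_feats, temperature=0.07):
    """Propagate per-pixel labels from a reference frame to a query frame.

    ref_feats: list of feature vectors, one per reference pixel
    ref_mask:  list of foreground probabilities in [0, 1], one per reference pixel
    qry_feats: list of feature vectors, one per query pixel
    temperature: softmax sharpness (an assumed value, not taken from the paper)
    """
    coarse = []
    for q in qry_feats:
        # Dot-product similarity between this query pixel and every reference pixel.
        sims = [sum(a * b for a, b in zip(q, r)) / temperature for r in ref_feats]
        # Numerically stable softmax over the reference pixels.
        peak = max(sims)
        weights = [math.exp(s - peak) for s in sims]
        total = sum(weights)
        # The propagated label is the affinity-weighted average of reference labels.
        coarse.append(sum(w * y for w, y in zip(weights, ref_mask)) / total)
    return coarse


# Toy data: two reference pixels (foreground, background) and two query pixels.
ref_feats = [[1.0, 0.0], [0.0, 1.0]]
ref_mask = [1.0, 0.0]
qry_feats = [[0.9, 0.1], [0.1, 0.9]]
coarse_mask = propagate_mask(ref_feats, ref_mask, qry_feats)
```

In this toy case the first query pixel resembles the foreground reference pixel and receives a value close to 1, while the second receives a value close to 0. A propagated mask of this kind is what the abstract calls a coarse mask, which a separate refinement stage would then improve.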
Pages: 7671-7684
Number of pages: 14