Spatio-temporal compression for semi-supervised video object segmentation

Cited by: 0
Authors
Ji, Chuanjun [1, 2]
Chen, Yadang [1, 2]
Yang, Zhi-Xin [3, 4]
Wu, Enhua [5]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[3] Univ Macau, State Key Lab Internet Things Smart City, Macau 999078, Peoples R China
[4] Univ Macau, Dept Electromech Engn, Macau 999078, Peoples R China
[5] Univ Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China
Source
VISUAL COMPUTER | 2023 / Vol. 39 / Issue 10
Funding
National Natural Science Foundation of China;
Keywords
Video object segmentation; External memory; Spatial-temporal redundancy; Memory reading;
DOI
10.1007/s00371-022-02638-4
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
In this paper, we explore spatial-temporal redundancy in semi-supervised video object segmentation (VOS) with the aim of improving computational efficiency. Recently, memory-based methods have attracted great attention for their excellent performance. These methods first construct an external memory to store target object information from historical frames and then select, through memory reading, the information that is beneficial for modeling the target object. However, the large amount of redundant information in memory makes such methods inefficient and prevents them from achieving both high accuracy and high efficiency. Moreover, they sample historical frames periodically and add them to memory; this operation may lose important information from dynamic frames in which the object changes incrementally, or aggravate temporal redundancy by storing static frames in which the object does not change. To address these problems, we propose an efficient semi-supervised VOS approach based on spatio-temporal compression (termed STCVOS). Specifically, we first adopt a temporally varying sensor to adaptively filter out static frames in which the target objects do not evolve and to trigger memory updates with frames containing noticeable variations. Furthermore, we propose a spatially compressed memory that absorbs features from pixels that have changed and removes outdated features, which considerably reduces information redundancy. More importantly, we introduce an efficient memory reader that performs memory reading with a smaller footprint and less computational overhead. Experimental results indicate that STCVOS performs well against state-of-the-art methods on the DAVIS 2017 and YouTube-VOS datasets, with overall J&F scores of 82.0% and 79.7%, respectively. Meanwhile, STCVOS achieves a high inference speed of approximately 30 FPS.
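The contrast the abstract draws between periodic sampling and change-triggered memory updates can be illustrated with a small sketch. It is not the STCVOS sensor, whose design the abstract does not specify; it only assumes that "noticeable variation" can be approximated by the drop in mask overlap (1 - IoU) relative to the most recently stored frame. The names `mask_iou` and `select_memory_frames` and the parameters `change_threshold` and `capacity` are hypothetical.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection-over-union of two boolean masks (1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def select_memory_frames(masks, change_threshold=0.2, capacity=5):
    """Return the indices of the frames kept in memory.

    Assumption (not from the paper): a frame is stored only when its mask
    differs from the most recently stored mask by more than
    `change_threshold`, measured as 1 - IoU. When the memory grows beyond
    `capacity`, the oldest entry other than the annotated first frame is
    dropped.
    """
    memory = [0]  # the annotated first frame is always kept
    for t in range(1, len(masks)):
        change = 1.0 - mask_iou(masks[memory[-1]], masks[t])
        if change > change_threshold:
            memory.append(t)
            if len(memory) > capacity:
                del memory[1]  # evict the oldest non-reference entry
    return memory
```

Under this assumption, a run of static frames adds nothing to memory, while a frame whose mask has shifted noticeably triggers an update. A periodic sampler would store frames at fixed intervals regardless of content.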
Pages: 4929-4942
Number of pages: 14
Related papers
50 in total
  • [1] Spatio-temporal compression for semi-supervised video object segmentation
    Chuanjun Ji
    Yadang Chen
    Zhi-Xin Yang
    Enhua Wu
    The Visual Computer, 2023, 39 : 4929 - 4942
  • [2] Learning from Spatio-temporal Correlation for Semi-Supervised LiDAR Semantic Segmentation
    Lee, Seungho
    Lee, Hwijeong
    Shim, Hyunjung
    arXiv,
  • [3] Semi-supervised Video Deraining Based on Enhanced Spatio-Temporal Interaction Network
    Jiang, Juhao
    Yang, Bin
    Chen, Guannan
    2022 2ND IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND ARTIFICIAL INTELLIGENCE (SEAI 2022), 2022, : 99 - 103
  • [4] A Novel Spatio-Temporal Video Object Segmentation Algorithm
    Zhu, Shiping
    Xia, Xi
    Zhang, Qingrong
    Belloulata, Kamel
    2008 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY, VOLS 1-5, 2008, : 1916 - +
  • [5] Efficient probabilistic spatio-temporal video object segmentation
    Ahmed, Rakib
    Karmakar, Gour C.
    Dooley, Laurence S.
    6TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE, PROCEEDINGS, 2007, : 807 - +
  • [6] A spatio-temporal video analysis system for object segmentation
    Xia, JH
    Wang, YL
    ISPA 2003: PROCEEDINGS OF THE 3RD INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS, PTS 1 AND 2, 2003, : 812 - 815
  • [7] Semi-supervised spatio-temporal CNN for recognition of surgical workflow
    Yuwen Chen
    Qi Long Sun
    Kunhua Zhong
    EURASIP Journal on Image and Video Processing, 2018
  • [8] Semi-supervised spatio-temporal CNN for recognition of surgical workflow
    Chen, Yuwen
    Sun, Qi Long
    Zhong, Kunhua
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2018,
  • [9] Vanishing mask refinement in semi-supervised video object segmentation
    Pita, Javier
    Llerena, Juan P.
    Patricio, Miguel A.
    Berlanga, Antonio
    Usero, Luis
    APPLIED SOFT COMPUTING, 2025, 172
  • [10] Semi-Supervised Video Object Segmentation with Super-Trajectories
    Wang, Wenguan
    Shen, Jianbing
    Porikli, Fatih
    Yang, Ruigang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (04) : 985 - 998