Spatio-temporal compression for semi-supervised video object segmentation

Cited by: 0
Authors
Ji, Chuanjun [1 ,2 ]
Chen, Yadang [1 ,2 ]
Yang, Zhi-Xin [3 ,4 ]
Wu, Enhua [5 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[3] Univ Macau, State Key Lab Internet Things Smart City, Macau 999078, Peoples R China
[4] Univ Macau, Dept Electromech Engn, Macau 999078, Peoples R China
[5] Univ Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China
Source
VISUAL COMPUTER | 2023 / Vol. 39 / Issue 10
Funding
National Natural Science Foundation of China;
Keywords
Video object segmentation; External memory; Spatial-temporal redundancy; Memory reading;
DOI
10.1007/s00371-022-02638-4
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
In this paper, we explore spatio-temporal redundancy in semi-supervised video object segmentation (VOS), with the aim of improving computational efficiency. Memory-based methods have recently attracted great attention for their excellent performance. These methods first construct an external memory that stores target-object information from historical frames and then, through memory reading, select the information that is useful for modeling the target object. However, the large amount of redundant information in memory makes such methods inefficient, so they cannot achieve both high accuracy and high efficiency. Moreover, they sample historical frames periodically and add them to memory; this may lose important information from dynamic frames in which the object changes incrementally, or aggravate temporal redundancy with static frames in which the object does not change. To address these problems, we propose an efficient semi-supervised VOS approach based on spatio-temporal compression (termed STCVOS). Specifically, we first adopt a temporally varying sensor that adaptively filters out static frames in which the target objects do not evolve and triggers a memory update for frames containing noticeable variations. Furthermore, we propose a spatially compressed memory that absorbs features from pixels that have changed and removes outdated features, which considerably reduces information redundancy. More importantly, we introduce an efficient memory reader that performs memory reading with a smaller memory footprint and lower computational overhead. Experimental results indicate that STCVOS performs well against state-of-the-art methods on the DAVIS 2017 and YouTube-VOS datasets, with overall J&F scores of 82.0% and 79.7%, respectively. Meanwhile, STCVOS achieves a high inference speed of approximately 30 FPS.
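The contrast the abstract draws between periodic sampling and change-triggered memory updates can be illustrated with a small sketch. This is not the authors' temporally varying sensor, whose design is not given in this record; it assumes a simple mean-absolute-difference gate, and the names `mean_abs_diff`, `select_memory_frames`, `periodic_frames`, and the `threshold` parameter are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical representation: each frame (or feature map) is a flat list of floats.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-element absolute difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_memory_frames(frames: List[Frame], threshold: float = 0.1) -> List[int]:
    """Return indices of frames a change-triggered gate would write to memory.

    Assumption (not from the paper): a frame is stored only when it differs
    from the most recently stored frame by more than `threshold`.
    The first frame is always stored.
    """
    if not frames:
        return []
    stored = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[stored[-1]]) > threshold:
            stored.append(i)
    return stored


def periodic_frames(n_frames: int, period: int) -> List[int]:
    """Baseline for comparison: store every `period`-th frame regardless of content."""
    return list(range(0, n_frames, period))


# Toy sequence: static for three frames, a change at index 3, another at index 6.
frames = [[0, 0, 0, 0]] * 3 + [[1, 1, 0, 0]] * 3 + [[1, 1, 1, 1]]
adaptive = select_memory_frames(frames, threshold=0.1)
periodic = periodic_frames(len(frames), period=5)
```

In this toy sequence the change-triggered gate stores the frames where the content actually changes, while the periodic baseline stores a redundant static frame and misses the final change, which is the trade-off the abstract describes.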
Pages: 4929-4942
Page count: 14