Global and Compact Video Context Embedding for Video Semantic Segmentation

Cited by: 0
Authors
Sun, Lei [1, 2]
Liu, Yun [3]
Sun, Guolei [2]
Wu, Min [3]
Xu, Zhijie [4]
Wang, Kaiwei [1]
Van Gool, Luc [2]
Affiliations
[1] Zhejiang Univ, Natl Res Ctr Opt Instrumentat, Hangzhou 310027, Peoples R China
[2] Swiss Fed Inst Technol, Comp Vis Lab, CH-8092 Zurich, Switzerland
[3] ASTAR, Inst Infocomm Res I2R, Singapore 138632, Singapore
[4] Univ Huddersfield, Ctr Visual & Immers Comp, Huddersfield HD1 3DH, England
Source
IEEE ACCESS | 2024 / Volume 12
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Context modeling; Feature extraction; Computational modeling; Sun; Optical flow; Shape; Video semantic segmentation; global video context; compact video context; video context embedding; NETWORK;
DOI
10.1109/ACCESS.2024.3409150
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Intuitively, global video context could benefit video semantic segmentation (VSS) if it is designed to simultaneously model global temporal and spatial dependencies for a holistic understanding of the semantic scenes in a video clip. However, we found that existing VSS approaches focus only on modeling local video context. This paper attempts to bridge this gap by learning global video context for VSS. Apart from being global, the video context should also be compact, given the large number of video feature tokens and the redundancy among nearby video frames. We then embed the learned global and compact video context into the features of the target video frame to improve their distinguishability. The proposed VSS method is dubbed Global and Compact Video Context Embedding (GCVCE). Because the context is compact, the number of global context tokens is very limited, so GCVCE is flexible and efficient for VSS. Since it may be too challenging to directly abstract a large number of video feature tokens into a small number of global context tokens, we further design a Cascaded Convolutional Downsampling (CCD) module before GCVCE to help it work better. A 1.6% improvement in mIoU on the popular VSPW dataset compared to previous state-of-the-art methods demonstrates the effectiveness and efficiency of GCVCE and CCD for VSS. Code and models will be made publicly available.
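The two-step idea in the abstract (abstract many video feature tokens into a few global context tokens, then embed those tokens into the target frame's features) can be illustrated with plain cross-attention. The sketch below is not the authors' GCVCE implementation: the function names, the use of unprojected dot-product attention, the residual addition, and the random inputs are all assumptions made for illustration, and the CCD downsampling stage is omitted entirely.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys_values):
    """Each query becomes a softmax-weighted average of the rows in keys_values.

    Assumption: no learned projections; keys and values are the same vectors.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in keys_values])
        out.append([sum(w * v[d] for w, v in zip(weights, keys_values))
                    for d in range(len(q))])
    return out


def embed_context(video_tokens, target_tokens, context_queries):
    """Hypothetical two-step context embedding.

    Step 1: a few context queries attend over all video tokens (global, compact).
    Step 2: target-frame tokens attend over the context tokens, and the result
    is added back to the target features as a residual.
    """
    context = cross_attention(context_queries, video_tokens)
    read = cross_attention(target_tokens, context)
    enriched = [[t + r for t, r in zip(tok, upd)]
                for tok, upd in zip(target_tokens, read)]
    return context, enriched


random.seed(0)
dim, num_frames, tokens_per_frame, num_context = 8, 4, 16, 3

# Stand-in features: random vectors in place of real backbone outputs.
video_tokens = [[random.gauss(0, 1) for _ in range(dim)]
                for _ in range(num_frames * tokens_per_frame)]
target_tokens = video_tokens[-tokens_per_frame:]  # last frame is the target
context_queries = [[random.gauss(0, 1) for _ in range(dim)]
                   for _ in range(num_context)]

context, enriched = embed_context(video_tokens, target_tokens, context_queries)
print(len(video_tokens), "video tokens ->", len(context), "context tokens")
```

In this toy setup the second attention step costs on the order of 16 x 3 similarity computations instead of 16 x 64, which is one way to read the abstract's claim that a very limited number of context tokens keeps the embedding efficient.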
Pages: 135589-135600
Page count: 12
Related papers
50 in total
  • [41] A video coverless information hiding algorithm based on semantic segmentation
    Pan, Nan
    Qin, Jiaohua
    Tan, Yun
    Xiang, Xuyu
    Hou, Guimin
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2020, 2020 (01)
  • [42] A Video Semantic Segmentation Method based on FCN and Data Argumentation
    Huang, Yuan
    Huang, Qian
    Chen, Qinglong
    Li, Yanping
    Sun, Xiaoqing
    2019 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2019), 2019, : 1478 - 1483
  • [43] Video semantic object segmentation by self-adaptation of DCNN
    Park, Seong-Jin
    Hong, Ki-Sang
    PATTERN RECOGNITION LETTERS, 2018, 112 : 249 - 255
  • [44] 3D video semantic segmentation for wildfire smoke
    Guodong Zhu
    Zhenxue Chen
    Chengyun Liu
    Xuewen Rong
    Weikai He
    Machine Vision and Applications, 2020, 31
  • [45] Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation
    Yin, Yingjie
    Xu, De
    Wang, Xingang
    Zhang, Lei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3884 - 3894
  • [46] Global video object segmentation with spatial constraint module
    Yadang Chen
    Duolin Wang
    Zhiguo Chen
    Zhi-Xin Yang
    Enhua Wu
    Computational Visual Media, 2023, 9 : 385 - 400
  • [47] Survey on fast dense video segmentation techniques
    Monnier, Quentin
    Pouli, Tania
    Kpalma, Kidiyo
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 241
  • [48] Global video object segmentation with spatial constraint module
    Chen, Yadang
    Wang, Duolin
    Chen, Zhiguo
    Yang, Zhi-Xin
    Wu, Enhua
    COMPUTATIONAL VISUAL MEDIA, 2023, 9 (02) : 385 - 400
  • [49] Augmented FCN: rethinking context modeling for semantic segmentation
    Zhang, Dong
    Zhang, Liyan
    Tang, Jinhui
    SCIENCE CHINA-INFORMATION SCIENCES, 2023, 66 (04)
  • [50] Exploiting semantic segmentation to boost reinforcement learning in video game environments
    Montalvo, Javier
    Garcia-Martin, Alvaro
    Bescos, Jesus
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (07) : 10961 - 10979