General and Task-Oriented Video Segmentation

Cited by: 0
Authors
Chen, Mu [1]
Li, Liulei [1]
Wang, Wenguan [2]
Quan, Ruijie [2]
Yang, Yi [2]
Affiliations
[1] Univ Technol Sydney, ReLER Lab, AAII, Ultimo, Australia
[2] Zhejiang Univ, ReLER Lab, CCAI, Hangzhou, Peoples R China
Source
COMPUTER VISION-ECCV 2024, PT VII | 2025 / Vol. 15065
Keywords
Video segmentation; General solution; Task-orientation; INSTANCE; TRANSFORMER; ATTENTION; SHAPE;
DOI
10.1007/978-3-031-72667-5_5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present GVSEG, a general video segmentation framework for addressing four different video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) while maintaining an identical architectural design. Currently, there is a trend towards developing general video segmentation solutions that can be applied across multiple tasks. This streamlines research endeavors and simplifies deployment. However, such a highly homogenized framework in current design, where each element maintains uniformity, could overlook the inherent diversity among different tasks and lead to suboptimal performance. To tackle this, GVSEG: i) provides a holistic disentanglement and modeling for segment targets, thoroughly examining them from the perspective of appearance, position, and shape, and on this basis, ii) reformulates the query initialization, matching and sampling strategies in alignment with the task-specific requirement. These architecture-agnostic innovations empower GVSEG to effectively address each unique task by accommodating the specific properties that characterize them. Extensive experiments on seven gold-standard benchmark datasets demonstrate that GVSEG surpasses all existing specialized/general solutions by a significant margin on four different video segmentation tasks.
Pages: 72-92
Page count: 21
Related papers
50 in total
  • [21] Realtime Human Segmentation in Video
    Zhang, Tairan
    Lang, Congyan
    Xing, Junliang
    MULTIMEDIA MODELING, MMM 2019, PT II, 2019, 11296 : 206 - 217
  • [22] Video segmentation techniques for news
    Philips, M
    Wolf, W
    MULTIMEDIA STORAGE AND ARCHIVING SYSTEMS, 1996, 2916 : 243 - 251
  • [23] Lecture Video Segmentation and Indexing
    Ma, Di
    Agam, Gady
    DOCUMENT RECOGNITION AND RETRIEVAL XIX, 2012, 8297
  • [24] Video Segmentation with Motion Smoothness
    Wen, Chung-Lin
    Chen, Bing-Yu
    Sato, Yoichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (04): : 873 - 881
  • [25] Enhancing face detection in video sequences by video segmentation preprocessing
    Liu, Huibin
    Fan, Zuoxun
    Chen, Qiang
    Zhang, Xiaomei
    APPLIED INTELLIGENCE, 2023, 53 (03) : 2897 - 2907
  • [26] Video Segmentation Based on Strong Target Constrained Video Saliency
    Zhang, Long
    Liu, Yujun
    Han, Shoudong
    2017 2ND INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC 2017), 2017, : 356 - 360
  • [27] Enhancing face detection in video sequences by video segmentation preprocessing
    Huibin Liu
    Zuoxun Fan
    Qiang Chen
    Xiaomei Zhang
    Applied Intelligence, 2023, 53 : 2897 - 2907
  • [28] Global and Compact Video Context Embedding for Video Semantic Segmentation
    Sun, Lei
    Liu, Yun
    Sun, Guolei
    Wu, Min
    Xu, Zhijie
    Wang, Kaiwei
    Van Gool, Luc
    IEEE ACCESS, 2024, 12 : 135589 - 135600
  • [29] An unequal segmentation algorithm for video in video-on-demand system
    Yang, C
    Xu, CY
    Liu, ZL
    Liu, WZ
    CHINESE JOURNAL OF ELECTRONICS, 2002, 11 (01): : 99 - 103
  • [30] Video object segmentation using SVMs
    Zhao, Y
    Li, HL
    Ahalt, SC
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL II, PROCEEDINGS: COMPUTER SCIENCE AND ENGINEERING, 2003, : 333 - 337