General and Task-Oriented Video Segmentation

Cited by: 0

Authors:

Chen, Mu [1]

Li, Liulei [1]

Wang, Wenguan [2]

Quan, Ruijie [2]

Yang, Yi [2]

Affiliations:

[1] Univ Technol Sydney, ReLER Lab, AAII, Ultimo, Australia

[2] Zhejiang Univ, ReLER Lab, CCAI, Hangzhou, Peoples R China

Source:

COMPUTER VISION-ECCV 2024, PT VII | 2025 / Vol. 15065

Keywords:

Video segmentation; General solution; Task-orientation; INSTANCE; TRANSFORMER; ATTENTION; SHAPE;

DOI:

10.1007/978-3-031-72667-5_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present GVSEG, a general video segmentation framework for addressing four different video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) while maintaining an identical architectural design. Currently, there is a trend towards developing general video segmentation solutions that can be applied across multiple tasks. This streamlines research endeavors and simplifies deployment. However, such a highly homogenized framework in current design, where each element maintains uniformity, could overlook the inherent diversity among different tasks and lead to suboptimal performance. To tackle this, GVSEG: i) provides a holistic disentanglement and modeling for segment targets, thoroughly examining them from the perspective of appearance, position, and shape, and on this basis, ii) reformulates the query initialization, matching and sampling strategies in alignment with the task-specific requirement. These architecture-agnostic innovations empower GVSEG to effectively address each unique task by accommodating the specific properties that characterize them. Extensive experiments on seven gold-standard benchmark datasets demonstrate that GVSEG surpasses all existing specialized/general solutions by a significant margin on four different video segmentation tasks.


Pages: 72-92

Number of pages: 21
