Video Object Segmentation without Temporal Information

Cited by: 209

Authors:

Maninis, Kevis-Kokitsi ^{[1]}

Caelles, Sergi ^{[1]}

Chen, Yuhua ^{[1]}

Pont-Tuset, Jordi ^{[1]}

Leal-Taixe, Laura ^{[2]}

Cremers, Daniel ^{[2]}

Van Gool, Luc ^{[1]}

Affiliations:

[1] ETHZ, CH-8092 Zurich, Switzerland

[2] TUM, D-80333 Munich, Germany

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2019 / Vol. 41 / No. 6

Funding:

European Union Horizon 2020;

Keywords:

Video object segmentation; convolutional neural networks; semantic segmentation; instance segmentation;

DOI:

10.1109/TPAMI.2018.2838670

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded, or some frames are missing in a sequence, the result of these methods can deteriorate significantly. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS^{S}), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance-level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent single-object video segmentation databases, which show that OSVOS^{S} is both the fastest and most accurate method in the state of the art. Experiments on multi-object video segmentation show that OSVOS^{S} obtains competitive results.


Pages: 1515-1530

Page count: 16

