Learning Video Object Segmentation from Static Images

Cited by: 372
Authors
Perazzi, Federico [1 ,2 ]
Khoreva, Anna [3 ]
Benenson, Rodrigo [3 ]
Schiele, Bernt [3 ]
Sorkine-Hornung, Alexander [1 ]
Affiliations
[1] Disney Research, Zurich, Switzerland
[2] ETH Zurich (Swiss Federal Institute of Technology), Zurich, Switzerland
[3] Max Planck Institute for Informatics, Saarbrücken, Germany
Source
30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017
DOI
10.1109/CVPR.2017.372
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Inspired by recent advances of deep learning in instance segmentation and object tracking, we introduce the concept of convnet-based guidance applied to video object segmentation. Our model proceeds on a per-frame basis, guided by the output of the previous frame towards the object of interest in the next frame. We demonstrate that highly accurate object segmentation in videos can be enabled by using a convolutional neural network (convnet) trained with static images only. The key component of our approach is a combination of offline and online learning strategies, where the former produces a refined mask from the previous frame's estimate and the latter allows the network to capture the appearance of the specific object instance. Our method can handle different types of input annotations, such as bounding boxes and segments, while leveraging an arbitrary amount of annotated frames. Our system is therefore suitable for diverse applications with different requirements in terms of accuracy and efficiency. In our extensive evaluation, we obtain competitive results on three different datasets, independently of the type of input annotation.
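To make the per-frame guidance idea concrete, the following is a minimal sketch of a mask-propagation loop of this kind: a segmentation convnet receives the current RGB frame together with the previous frame's mask estimate as an extra input channel and predicts a refined mask, which then guides the next frame. The toy architecture, the names GuidedSegNet and segment_video, and the exact 4-channel input convention are illustrative assumptions for this sketch, not the authors' released implementation; the offline pre-training on static images and the online fine-tuning described in the abstract are omitted.

# Minimal sketch of per-frame mask propagation with a guidance convnet.
# The tiny network and all names below are illustrative assumptions,
# not the authors' released code.
import torch
import torch.nn as nn

class GuidedSegNet(nn.Module):
    """Toy stand-in for a segmentation convnet whose input is the current
    RGB frame concatenated with the mask estimate from the previous frame."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # per-pixel foreground logit
        )

    def forward(self, frame, prev_mask):
        # frame: (B, 3, H, W); prev_mask: (B, 1, H, W), values in [0, 1]
        x = torch.cat([frame, prev_mask], dim=1)
        return torch.sigmoid(self.body(x))

def segment_video(frames, first_mask, net):
    """Propagate the first-frame annotation through the video one frame at a
    time, each step guided by the previous frame's mask estimate."""
    masks = [first_mask]
    with torch.no_grad():
        for frame in frames[1:]:
            masks.append(net(frame, masks[-1]))
    return masks

Because the guidance enters only as an extra input channel, the same loop can start from different kinds of first-frame annotation (a segment, or a mask rasterized from a bounding box), which mirrors the flexibility claimed in the abstract.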
Pages: 3491-3500
Page count: 10