Semantic Video CNNs through Representation Warping

Cited by: 140
Authors
Gadde, Raghudeep [1, 3]
Jampani, Varun [1, 4]
Gehler, Peter V. [1, 2, 3]
Affiliations
[1] MPI Intelligent Syst, Stuttgart, Germany
[2] Univ Wurzburg, Wurzburg, Germany
[3] Bernstein Ctr Computat Neurosci, Berlin, Germany
[4] NVIDIA, Santa Clara, CA USA
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017
Keywords
SEGMENTATION;
DOI
10.1109/ICCV.2017.477
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this work, we propose a technique to convert CNN models for semantic segmentation of static images into CNNs for video data. We describe a warping method that can be used to augment existing architectures with very little extra computational cost. We call this module Net-Warp and demonstrate its use for a range of network architectures. The main design principle is to use the optical flow between adjacent frames to warp internal network representations across time. A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training. Experiments validate that, when video streams are available, the proposed approach improves performance while incurring little extra computational cost. We achieve new state-of-the-art results on the CamVid and Cityscapes benchmark datasets and show consistent improvements over different baseline networks. Our code and models are available at http://segmentation.is.tue.mpg.de
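The central operation in the abstract, warping an internal representation from the previous frame into the current frame using optical flow, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `warp_features` and `combine`, the flow convention (displacement from each current-frame pixel to its match in the previous frame), the border clamping, and the fixed 0.5/0.5 mixing weights are all assumptions made for illustration; the abstract does not specify how warped and current features are merged.

```python
import numpy as np


def warp_features(prev_feat, flow):
    """Bilinearly warp prev_feat (C, H, W) into the current frame.

    Assumed convention: flow has shape (2, H, W); flow[0] is the x and
    flow[1] the y displacement from each current-frame pixel to its
    matching location in the previous frame.
    """
    _, h, w = prev_feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Sampling positions in the previous frame, clamped to the border.
    x = np.clip(xs + flow[0], 0, w - 1)
    y = np.clip(ys + flow[1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0  # horizontal interpolation weight
    wy = y - y0  # vertical interpolation weight
    top = prev_feat[:, y0, x0] * (1 - wx) + prev_feat[:, y0, x1] * wx
    bottom = prev_feat[:, y1, x0] * (1 - wx) + prev_feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def combine(curr_feat, warped_prev, w_curr=0.5, w_prev=0.5):
    """Hypothetical merge step: a fixed weighted sum of the two maps."""
    return w_curr * curr_feat + w_prev * warped_prev


if __name__ == "__main__":
    prev = np.arange(12, dtype=float).reshape(1, 3, 4)
    curr = np.zeros_like(prev)
    # Every current pixel looks one pixel to the left in the previous frame.
    flow = np.zeros((2, 3, 4))
    flow[0] = -1.0
    fused = combine(curr, warp_features(prev, flow))
    print(fused.shape)
```

In a trainable network the merge weights would typically be learned rather than fixed, and because bilinear sampling is differentiable with respect to the features, such a step is compatible with the end-to-end training the abstract mentions.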
Pages: 4463-4472
Page count: 10