Semantic Video CNNs through Representation Warping

Cited by: 140

Authors:

Gadde, Raghudeep ^{[1,3]}

Jampani, Varun ^{[1,4]}

Gehler, Peter V. ^{[1,2,3]}

Affiliations:

[1] MPI Intelligent Syst, Stuttgart, Germany

[2] Univ Wurzburg, Wurzburg, Germany

[3] Bernstein Ctr Computat Neurosci, Berlin, Germany

[4] NVIDIA, Santa Clara, CA USA

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017

Keywords:

SEGMENTATION;

DOI:

10.1109/ICCV.2017.477

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this work, we propose a technique to convert CNN models for semantic segmentation of static images into CNNs for video data. We describe a warping method that can be used to augment existing architectures with very little extra computational cost. This module is called Net-Warp and we demonstrate its use for a range of network architectures. The main design principle is to use optical flow of adjacent frames for warping internal network representations across time. A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training. Experiments validate that the proposed approach incurs only little extra computational cost, while improving performance, when video streams are available. We achieve new state-of-the-art results on the CamVid and Cityscapes benchmark datasets and show consistent improvements over different baseline networks. Our code and models are available at http://segmentation.is.tue.mpg.de
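The design principle named in the abstract, warping an internal representation of the previous frame along optical flow, can be illustrated with a minimal NumPy sketch. This is not the authors' Net-Warp code: the function names `warp_features` and `combine`, the flow direction convention, the border clamping, and the fixed-weight sum used to merge warped and current features are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def warp_features(prev_feat, flow):
    """Bilinearly sample prev_feat (C, H, W) along flow (2, H, W).

    Assumed convention: flow[0] is the horizontal (x) and flow[1] the
    vertical (y) displacement in pixels, pointing from each current-frame
    location to its matching location in the previous frame.
    """
    _, h, w = prev_feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Sampling positions in the previous frame, clamped to the border.
    x = np.clip(xs + flow[0], 0, w - 1)
    y = np.clip(ys + flow[1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets act as bilinear interpolation weights.
    wx = x - x0
    wy = y - y0

    top = prev_feat[:, y0, x0] * (1 - wx) + prev_feat[:, y0, x1] * wx
    bottom = prev_feat[:, y1, x0] * (1 - wx) + prev_feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def combine(cur_feat, warped_prev, w_cur=0.5, w_prev=0.5):
    """Merge current and warped features with fixed weights (an assumption)."""
    return w_cur * cur_feat + w_prev * warped_prev
```

In an end-to-end trainable setting, the fixed weights in `combine` would presumably be replaced by learned parameters, and the sampling step by a differentiable equivalent from a deep learning framework.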


Pages: 4463-4472

Page count: 10

