Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation

Cited by: 57
Authors
Lv, Zhaoyang [1 ]
Kim, Kihwan [2 ]
Troccoli, Alejandro [2 ]
Sun, Deqing [2 ]
Rehg, James M. [1 ]
Kautz, Jan [2 ]
Affiliations
[1] Georgia Inst Technol, Atlanta, GA 30332 USA
[2] NVIDIA, Santa Clara, CA USA
Source
COMPUTER VISION - ECCV 2018, PT V | 2018, Vol. 11209
Funding
U.S. National Science Foundation (NSF);
Keywords
Rigidity estimation; Dynamic scene analysis; Scene flow; Motion segmentation;
DOI
10.1007/978-3-030-01228-1_29
CLC Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Estimation of 3D motion in a dynamic scene from a temporal pair of images is a core task in many scene understanding problems. In real-world applications, a dynamic scene is commonly captured by a moving camera (i.e., panning, tilting, or hand-held), increasing the task complexity because the scene is observed from different viewpoints. The primary challenge is disambiguating camera motion from scene motion, which becomes more difficult as the amount of observed rigidity decreases, even with successful estimation of 2D image correspondences. In contrast to other state-of-the-art 3D scene flow estimation methods, we propose in this paper to learn the rigidity of a scene in a supervised manner from an extensive collection of dynamic scene data, and to directly infer a rigidity mask from two sequential images with depths. With the learned network, we show how to effectively estimate camera motion and projected scene flow using computed 2D optical flow and the inferred rigidity mask. For training and testing the rigidity network, we also provide a new semi-synthetic dynamic scene dataset (synthetic foreground objects composited onto a real background) and an evaluation split that accounts for the percentage of observed non-rigid pixels. Through our evaluation, we show the proposed framework outperforms current state-of-the-art scene flow estimation methods in challenging dynamic scenes.
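The decomposition the abstract describes — use rigid pixels to recover camera motion, then treat the remaining 3D residual as scene flow — can be sketched in NumPy. This is a hedged illustration of the general idea, not the authors' implementation: the function names (`backproject`, `estimate_rigid_transform`, `scene_flow`) are hypothetical, and a plain Kabsch/Procrustes solve stands in for whatever robust pose estimation the paper actually uses.

```python
import numpy as np

def backproject(depth, u, v, fx, fy, cx, cy):
    """Back-project pixels (u, v) with depth into camera-space 3D points
    using pinhole intrinsics (hypothetical helper, standard formula)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def estimate_rigid_transform(X1, X2):
    """Least-squares rigid transform (Kabsch): R, t with X2 ~= X1 @ R.T + t."""
    c1, c2 = X1.mean(0), X2.mean(0)
    H = (X1 - c1).T @ (X2 - c2)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c2 - R @ c1
    return R, t

def scene_flow(X1, X2, rigidity_mask):
    """Fit camera motion on rigid points only; the residual after removing
    camera-induced motion is the (projected-back) 3D scene flow."""
    R, t = estimate_rigid_transform(X1[rigidity_mask], X2[rigidity_mask])
    camera_induced = X1 @ R.T + t               # where static geometry lands in frame 2
    return X2 - camera_induced, (R, t)          # nonzero only for truly moving points
```

In this sketch, `X1`/`X2` are per-pixel 3D points obtained from the two depth maps with correspondences given by the 2D optical flow; on a fully rigid scene the residual is identically zero and `(R, t)` is the camera motion.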
Pages: 484-501
Page count: 18