Region Deformer Networks for Unsupervised Depth Estimation from Unconstrained Monocular Videos

Cited by: 0
Authors
Xu, Haofei [1 ]
Zheng, Jianmin [2 ]
Cai, Jianfei [2 ]
Zhang, Juyong [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Nanyang Technol Univ, Singapore, Singapore
Source
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2019
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While learning based depth estimation from images/videos has achieved substantial progress, there still exist intrinsic limitations. Supervised methods are limited by the small amount of ground truth or labeled data, and unsupervised methods for monocular videos are mostly based on the static scene assumption, not performing well on real-world scenarios with the presence of dynamic objects. In this paper, we propose a new learning based method consisting of DepthNet, PoseNet and Region Deformer Networks (RDN) to estimate depth from unconstrained monocular videos without ground truth supervision. The core contribution lies in RDN for proper handling of rigid and non-rigid motions of various objects such as rigidly moving cars and deformable humans. In particular, a deformation based motion representation is proposed to model individual object motion on 2D images. This representation enables our method to be applicable to diverse unconstrained monocular videos. Our method not only achieves state-of-the-art results on the standard benchmarks KITTI and Cityscapes, but also shows promising results on a crowded pedestrian tracking dataset, which demonstrates the effectiveness of the deformation based motion representation. Code and trained models are available at https://github.com/haofeixu/rdn4depth.
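The abstract describes the method only at a high level: DepthNet and PoseNet handle scene depth and camera ego-motion, while per-object Region Deformer Networks output a deformation based 2D motion representation for each moving region. As a rough illustration of what such a representation could look like, the sketch below applies a low-order polynomial displacement field to the pixels inside one object mask and warps the source frame with it. The polynomial degree, the function names, and the toy mask are assumptions made for this sketch, not the authors' implementation; the actual code is in the repository linked above.

# Hypothetical sketch (not the authors' code): per-region 2D deformation warping,
# assuming each object's motion is modeled by a low-order polynomial displacement
# field predicted per segmented region and applied via differentiable warping.
import torch
import torch.nn.functional as F


def polynomial_displacement(coords, coeffs, degree=3):
    """Evaluate a 2D polynomial displacement field.

    coords: (H, W, 2) normalized pixel coordinates in [-1, 1].
    coeffs: (2, K) coefficients for the x- and y-displacements, where K is the
            number of monomials x^i * y^j with i + j <= degree.
    Returns: (H, W, 2) per-pixel displacement.
    """
    x, y = coords[..., 0], coords[..., 1]
    basis = [x**i * y**j for i in range(degree + 1)
             for j in range(degree + 1 - i)]          # K monomials
    basis = torch.stack(basis, dim=-1)                # (H, W, K)
    return basis @ coeffs.t()                         # (H, W, 2)


def warp_with_region_deformation(image, mask, coeffs):
    """Warp `image` by a polynomial deformation applied only inside `mask`.

    image:  (1, C, H, W) source frame.
    mask:   (H, W) binary mask of one moving object region.
    coeffs: (2, K) polynomial coefficients (e.g. predicted by a small network).
    """
    _, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1)            # (H, W, 2), grid_sample order
    disp = polynomial_displacement(coords, coeffs)    # (H, W, 2)
    disp = disp * mask.unsqueeze(-1)                  # deform only the object region
    grid = (coords + disp).unsqueeze(0)               # (1, H, W, 2)
    return F.grid_sample(image, grid, align_corners=True)


if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)
    region = torch.zeros(64, 64)
    region[20:40, 20:40] = 1.0                        # toy "object" mask
    k = sum(1 for i in range(4) for j in range(4 - i))  # K = 10 for degree 3
    params = 0.01 * torch.randn(2, k)                 # stands in for RDN output
    warped = warp_with_region_deformation(img, region, params)
    print(warped.shape)                               # torch.Size([1, 3, 64, 64])

In an end-to-end system the coefficients would be predicted by the RDN for each region and trained jointly with DepthNet and PoseNet through a photometric reconstruction loss; that training loop is omitted here.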
Pages: 5685-5691
Number of pages: 7