Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation

Cited by: 280
Authors
Xu, Dan [1]
Wang, Wei [1]
Tang, Hao [1]
Liu, Hong [2]
Sebe, Nicu [1]
Ricci, Elisa [1, 3]
Affiliations
[1] Univ Trento, Multimedia & Human Understanding Grp, Trento, Italy
[2] Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Beijing, Peoples R China
[3] Fdn Bruno Kessler, Technol Vis Grp, Trento, Italy
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR.2018.00412
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent works have shown the benefit of integrating Conditional Random Fields (CRFs) models into deep architectures for improving pixel-level prediction tasks. Following this line of research, in this paper we introduce a novel approach for monocular depth estimation. Similarly to previous works, our method employs a continuous CRF to fuse multi-scale information derived from different layers of a front-end Convolutional Neural Network (CNN). Differently from past works, our approach benefits from a structured attention model which automatically regulates the amount of information transferred between corresponding features at different scales. Importantly, the proposed attention model is seamlessly integrated into the CRF allowing end-to-end training of the entire architecture. Our extensive experimental evaluation demonstrates the effectiveness of the proposed method which is competitive with previous methods on the KITTI benchmark and outperforms the state of the art on the NYU Depth V2 dataset.
Pages: 3917-3925
Number of pages: 9