CFDepthNet: Monocular Depth Estimation Introducing Coordinate Attention and Texture Features

Cited by: 1
Authors
Wei, Feng [1]
Zhu, Jie [1]
Wang, Huibin [1]
Shen, Jie [1]
Affiliations
[1] Hohai Univ, Sch Comp & Informat, Nanjing 211100, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Coordinate attention; Texture feature metric loss; Photometric error loss; Monocular depth estimation;
DOI
10.1007/s11063-024-11477-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimating depth in low-texture regions with a photometric error loss is challenging: pixels in low-texture (or even textureless) regions have multiple local minima, which makes convergence difficult. In this paper, in addition to the photometric loss, we introduce a texture feature metric loss as a constraint and combine it with a coordinate attention mechanism to improve the texture quality and edge detail of the depth map. We use a simple, compact network structure, a unique loss function, and a relatively flexible embedded attention module, which makes the method more effective and easier to deploy on robotic platforms with limited computing power. Experiments show that our network achieves high-quality, state-of-the-art results on the KITTI dataset, and that the same trained model also performs well on the Cityscapes and Make3D datasets.
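The convergence problem described in the abstract can be illustrated with a small sketch. It is not the loss used in the paper: it assumes a plain mean absolute intensity difference as the photometric error, a synthetic 1-D scanline, and a known horizontal shift between the two views. The names `photometric_error` and `error_profile` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def photometric_error(target, source, x, disparity, half_window=2):
    """Mean absolute intensity difference between the target patch centred
    at x and the source patch shifted by `disparity` pixels."""
    lo, hi = x - half_window, x + half_window + 1
    diff = target[lo:hi] - source[lo + disparity:hi + disparity]
    return float(np.mean(np.abs(diff)))


def error_profile(target, source, x, max_disparity=6):
    """Photometric error at pixel x for every candidate disparity 0..max."""
    return [photometric_error(target, source, x, d)
            for d in range(max_disparity + 1)]


# Synthetic scanline (assumption): a flat region followed by a textured one.
rng = np.random.default_rng(0)
target = np.concatenate([np.full(30, 0.5), rng.random(30)])

# The source view is the target shifted by the true disparity of 3 pixels,
# so source[i + 3] == target[i] away from the wrapped border.
true_disparity = 3
source = np.roll(target, true_disparity)

flat_profile = error_profile(target, source, x=12)      # low-texture pixel
textured_profile = error_profile(target, source, x=45)  # textured pixel

print("flat region:    ", np.round(flat_profile, 3))
print("textured region:", np.round(textured_profile, 3))
```

In the flat region every candidate disparity gives the same zero error, so the photometric term alone cannot single out the correct value; in the textured region the error has a single minimum at the true disparity. An additional texture-based constraint, such as the one the abstract describes, is one way to make such pixels distinguishable.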
Pages: 17