Lightweight monocular absolute depth estimation based on attention mechanism

Cited by: 1
Authors
Jin, Jiayu [1 ,2 ]
Tao, Bo [1 ]
Qian, Xinbo [2 ,3 ]
Hu, Jiaxin [3 ]
Li, Gongfa [4 ]
Affiliations
[1] Wuhan Univ Sci & Technol, Key Lab Met Equipment & Control Technol, Minist Educ, Wuhan, Peoples R China
[2] Wuhan Univ Sci & Technol, Hubei Key Lab Mech Transmiss & Mfg Engn, Wuhan, Peoples R China
[3] Wuhan Univ Sci & Technol, Precis Mfg Inst, Wuhan, Peoples R China
[4] Wuhan Univ Sci & Technol, Res Ctr Biomimet Robot & Intelligent Measurement &, Wuhan, Peoples R China
Keywords
lightweight network; deep learning; monocular depth estimation; channel attention; self-supervised;
DOI
10.1117/1.JEI.33.2.023010
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification code
0808 ; 0809 ;
Abstract
To avoid trading model size for accuracy, we propose a lightweight network architecture that retains the high-precision advantage of the transformer and combines it effectively with a convolutional neural network. By greatly reducing the number of trainable parameters, the approach achieves high precision while remaining well suited for deployment on edge devices. A detail highlight module (DHM) fuses information from multiple scales, making the predicted depth more accurate and sharper. A dense geometric constraints module recovers accurate scale factors in autonomous driving scenes without additional sensors. Experimental results demonstrate that, compared with Monodepth2, our model improves accuracy from 98.1% to 98.3% while reducing the model parameters by about 80%.
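The abstract does not detail how the dense geometric constraints module recovers metric scale, but a common approach in self-supervised monocular depth for driving scenes is to back-project pixels assumed to lie on the road plane and divide the known camera mounting height by the median estimated height. The sketch below illustrates that idea only; the function name, the pinhole-model assumptions (y-axis pointing down, intrinsics `fy`, `cy`), and the choice of the median are illustrative assumptions, not the paper's actual method.

```python
from statistics import median

def recover_scale(ground_pixels, pred_depths, fy, cy, cam_height_m):
    """Median-based scale recovery from ground-plane geometry (illustrative).

    ground_pixels: list of (u, v) pixel coordinates assumed to lie on the road.
    pred_depths:   relative (up-to-scale) depths predicted for those pixels.
    fy, cy:        vertical focal length and principal point of the camera.
    cam_height_m:  known metric height of the camera above the road.
    """
    # Back-project each ground pixel: with the y-axis pointing down,
    # the point's height below the camera is Y = (v - cy) / fy * depth.
    heights = [(v - cy) / fy * d for (u, v), d in zip(ground_pixels, pred_depths)]
    # Dividing the known mounting height by the median estimated height
    # yields the metric scale factor for the whole depth map.
    return cam_height_m / median(heights)
```

For example, if the network's relative depths are uniformly half the true metric depths, the estimated camera height comes out at half the real one, and the function returns a scale factor of 2.0. The median makes the estimate robust to ground pixels that were misclassified or poorly predicted.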
Pages: 13
Related papers
44 references in total
[1] Ai, Yong-bao; Rui, Ting; Yang, Xiao-qiang; He, Jia-lin; Fu, Lei; Li, Jian-bin; Lu, Ming. Visual SLAM in dynamic environments based on object detection [J]. DEFENCE TECHNOLOGY, 2021, 17(05): 1712-1721.
[2] Bae J, 2022, Arxiv, DOI arXiv:2205.11083.
[3] Carion, Nicolas; Massa, Francisco; Synnaeve, Gabriel; Usunier, Nicolas; Kirillov, Alexander; Zagoruyko, Sergey. End-to-End Object Detection with Transformers [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 213-229.
[4] Casser, Vincent; Pirk, Soeren; Mahjourian, Reza; Angelova, Anelia. Unsupervised monocular depth and ego-motion learning with structure and semantics [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019: 381-388.
[5] Dosovitskiy A., 2021, INT C LEARNING REPRE.
[6] El-Nouby A, 2021, ADV NEUR IN.
[7] Fu, Jun; Liu, Jing; Tian, Haijie; Li, Yong; Bao, Yongjun; Fang, Zhiwei; Lu, Hanqing. Dual Attention Network for Scene Segmentation [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 3141-3149.
[8] Gallagher L., 2021, A hybrid sparse-dense monocular SLAM system for autonomous driving.
[9] Garg, Ravi; VijayKumar, B. G.; Carneiro, Gustavo; Reid, Ian. Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue [J]. COMPUTER VISION - ECCV 2016, PT VIII, 2016, 9912: 740-756.
[10] Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The KITTI dataset [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2013, 32(11): 1231-1237.