Lightweight monocular depth estimation network for robotics using intercept block GhostNet

Cited by: 1
Authors
Ardiyanto, Igi [1]
Al-Fahsi, Resha [1]
Affiliations
[1] Univ Gadjah Mada, Dept Elect & Informat Engn, Fac Engn, Jl Grafika 2, Yogyakarta 55281, Indonesia
Keywords
Monocular depth estimation; Lightweight network; Intercept block GhostNet
DOI
10.1007/s11760-024-03720-1
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
This study introduces a novel deep learning approach for monocular depth estimation that achieves excellent accuracy while using significantly fewer computational and memory resources. The proposed method, IBG-Mono, pairs an Intercept Block with cost-effective GhostNet components to efficiently extract relevant information from the input image. The Intercept Block retains low-resolution feature maps derived from the input image and integrates them with downsampled feature maps at different resolutions, while the GhostNet components allow the Intercept Block to process these coarse-grained feature maps efficiently and merge them seamlessly with the downsampled features. Progressive downsampling is further employed to keep the feature maps of varying resolutions spatially aligned with their downsampled counterparts. Extensive experiments on the NYU Depth V2 and KITTI datasets compare the proposed network, which requires only 0.63M parameters and 0.31 GMACs, against state-of-the-art lightweight monocular depth estimation methods, and the results demonstrate its superiority over existing lightweight approaches.
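The abstract describes the architecture only at a high level, so the following PyTorch sketch is an illustration rather than the authors' implementation: the GhostModule follows the published GhostNet design of generating part of the output channels with a cheap depthwise convolution, while the InterceptBlock class, its feat_ch/image_ch parameters, and the bilinear resizing of the input image are assumptions about how low-resolution image features might be fused with progressively downsampled feature maps.

```python
# Minimal sketch of the two building blocks named in the abstract.
# GhostModule follows the published GhostNet idea (primary conv + cheap
# depthwise "ghost" features); InterceptBlock is a hypothetical fusion
# of a resized input image with a downsampled feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GhostModule(nn.Module):
    """Half the output channels from a regular conv, the rest from a
    cheap depthwise conv over the primary features (GhostNet-style)."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 1):
        super().__init__()
        primary_ch = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(primary_ch),
            nn.ReLU(inplace=True),
        )
        # Cheap operation: 3x3 depthwise conv over the primary features.
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, out_ch - primary_ch, 3,
                      padding=1, groups=primary_ch, bias=False),
            nn.BatchNorm2d(out_ch - primary_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        primary = self.primary(x)
        ghost = self.cheap(primary)
        return torch.cat([primary, ghost], dim=1)


class InterceptBlock(nn.Module):
    """Hypothetical fusion block: resize the RGB input to the current
    feature resolution, embed it with a Ghost module, and merge it with
    the downsampled backbone features."""

    def __init__(self, feat_ch: int, image_ch: int = 3):
        super().__init__()
        self.image_branch = GhostModule(image_ch, feat_ch)
        self.fuse = GhostModule(2 * feat_ch, feat_ch)

    def forward(self, feat: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # Resizing the input image to the feature resolution keeps the two
        # branches spatially aligned before fusion.
        low_res = F.interpolate(image, size=feat.shape[-2:],
                                mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([self.image_branch(low_res), feat], dim=1))


if __name__ == "__main__":
    feat = torch.randn(1, 32, 60, 80)    # downsampled feature map
    image = torch.randn(1, 3, 480, 640)  # full-resolution input image
    out = InterceptBlock(feat_ch=32)(feat, image)
    print(out.shape)  # torch.Size([1, 32, 60, 80])
```

Reusing Ghost modules for both the image branch and the fusion step keeps the parameter count of such a block very low, which is in the spirit of the reported 0.63M-parameter budget, though the paper's actual layer configuration may differ.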
Pages: 14