Boosting Lightweight Depth Estimation via Knowledge Distillation

Cited by: 10
Authors
Hu, Junjie [1]
Fan, Chenyou [2]
Jiang, Hualie [3]
Guo, Xiyue [4]
Gao, Yuan [1]
Lu, Xiangyong [5]
Lam, Tin Lun [1,3]
Affiliations
[1] Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, People's Republic of China
[2] South China Normal University, Guangzhou, People's Republic of China
[3] The Chinese University of Hong Kong, Shenzhen, People's Republic of China
[4] Zhejiang University, Hangzhou, People's Republic of China
[5] Tohoku University, Sendai, Miyagi, Japan
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, KSEM 2023 | 2023, Vol. 14117
Keywords
Depth estimation; Lightweight network; Knowledge distillation; Auxiliary data
DOI
10.1007/978-3-031-40283-8_3
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Monocular depth estimation (MDE) methods are often either too computationally expensive or not accurate enough, owing to the trade-off between model complexity and inference performance. In this paper, we propose a lightweight network that accurately estimates depth maps with minimal computing resources. We achieve this by designing a compact model that reduces model complexity as far as possible. To improve the performance of this lightweight network, we adopt knowledge distillation (KD): a large network serves as an expert teacher that accurately estimates depth maps on the target domain, and the student, i.e., the lightweight network, is trained to mimic the teacher's predictions. However, this KD process can be challenging and insufficient because of the large capacity gap between teacher and student. To address this, we propose using auxiliary unlabeled data to guide KD, enabling the student to learn from the teacher's predictions more effectively. This helps bridge the gap between teacher and student and leads to better data-driven learning. Experiments show that our method achieves performance comparable to state-of-the-art methods while using only 1% of their parameters, and that it outperforms previous lightweight methods in inference accuracy, computational efficiency, and generalizability.
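To make the training recipe described in the abstract concrete, the sketch below shows one possible distillation step in PyTorch. It is not the authors' implementation: the loss choices (L1), the weight alpha, and the names student, teacher, labeled_batch, and unlabeled_batch are illustrative assumptions.

    # Minimal sketch (assumed, not the authors' code) of knowledge distillation
    # for monocular depth estimation with auxiliary unlabeled data.
    import torch
    import torch.nn.functional as F

    def distillation_step(student, teacher, optimizer,
                          labeled_batch, unlabeled_batch, alpha=0.5):
        """One training step: supervised loss on labeled target-domain images
        plus a distillation loss that makes the student mimic the teacher's
        predicted depth on both labeled and auxiliary unlabeled images."""
        images, gt_depth = labeled_batch          # target-domain RGB + ground-truth depth
        aux_images = unlabeled_batch              # auxiliary unlabeled RGB

        with torch.no_grad():                     # the teacher is frozen
            t_depth_lab = teacher(images)
            t_depth_aux = teacher(aux_images)

        s_depth_lab = student(images)
        s_depth_aux = student(aux_images)

        # Supervised term on ground-truth depth (L1 is a common choice).
        sup_loss = F.l1_loss(s_depth_lab, gt_depth)

        # Distillation term: match the teacher's predictions on both sets.
        kd_loss = (F.l1_loss(s_depth_lab, t_depth_lab) +
                   F.l1_loss(s_depth_aux, t_depth_aux))

        loss = sup_loss + alpha * kd_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In this sketch only the student's parameters are updated; the auxiliary unlabeled images, which have no ground-truth depth, contribute solely through the distillation term that aligns the student with the teacher.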
Pages: 27-39 (13 pages)