HMS-Net: Hierarchical Multi-Scale Sparsity-Invariant Network for Sparse Depth Completion

Cited by: 97

Authors:

Huang, Zixuan [1,2]

Fan, Junming [1]

Cheng, Shenggan [1]

Yi, Shuai [1]

Wang, Xiaogang [3]

Li, Hongsheng [3]

Affiliations:

[1] SenseTime Res, Beijing 100080, Peoples R China

[2] Univ Wisconsin, Dept Comp Sci, 1210 W Dayton St, Madison, WI 53706 USA

[3] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2020 / Vol. 29

Keywords:

Depth completion; convolutional neural network; sparsity-invariant operations; RECOVERY;

DOI:

10.1109/TIP.2019.2960589

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dense depth cues are important and have wide applications in various computer vision tasks. In autonomous driving, LIDAR sensors are adopted to acquire depth measurements around the vehicle to perceive the surrounding environments. However, depth maps obtained by LIDAR are generally sparse because of its hardware limitation. The task of depth completion attracts increasing attention, which aims at generating a dense depth map from an input sparse depth map. To effectively utilize multi-scale features, we propose three novel sparsity-invariant operations, based on which, a sparsity-invariant multi-scale encoder-decoder network (HMS-Net) for handling sparse inputs and sparse feature maps is also proposed. Additional RGB features could be incorporated to further improve the depth completion performance. Our extensive experiments and component analysis on two public benchmarks, KITTI depth completion benchmark and NYU-depth-v2 dataset, demonstrate the effectiveness of the proposed approach. As of Aug. 12th, 2018, on KITTI depth completion leaderboard, our proposed model without RGB guidance ranks 1st among all peer-reviewed methods without using RGB information, and our model with RGB guidance ranks 2nd among all RGB-guided methods.

Pages: 3429-3441

Page count: 13

References: 46 in total

[1] [Anonymous], P EUR C COMPUT VIS

[2] [Anonymous], 2018, P EUR C COMP VIS

[3] [Anonymous], IEEE T PATTERN ANAL

[4] [Anonymous], 2019, ARXIV190205356

[5] [Anonymous], P INT C LEARN REPR

[6] The Fast Bilateral Solver [J]. Barron, Jonathan T.; Poole, Ben. COMPUTER VISION - ECCV 2016, PT III, 2016, 9907: 617-632

[7] Efficient Spatio-Temporal Hole Filling Strategy for Kinect Depth Maps [J]. Camplani, Massimo; Salgado, Luis. THREE-DIMENSIONAL IMAGE PROCESSING (3DIP) AND APPLICATIONS II, 2012, 8290

[8] Chen L, 2012, INT C PATT RECOG, P3070

[9] Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network [J]. Cheng, Xinjing; Wang, Peng; Yang, Ruigang. COMPUTER VISION - ECCV 2018, PT XVI, 2018, 11220: 108-125

[10] The Cityscapes Dataset for Semantic Urban Scene Understanding [J]. Cordts, Marius; Omran, Mohamed; Ramos, Sebastian; Rehfeld, Timo; Enzweiler, Markus; Benenson, Rodrigo; Franke, Uwe; Roth, Stefan; Schiele, Bernt. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 3213-3223