Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation

Cited by: 119
Authors
Hou, Yuenan [1 ]
Zhu, Xinge [2 ]
Ma, Yuexin [3 ]
Loy, Chen Change [4 ]
Li, Yikang [1 ]
Affiliations
[1] Shanghai AI Lab, Shanghai, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] ShanghaiTech Univ, Shanghai, Peoples R China
[4] Nanyang Technol Univ, S Lab, Singapore, Singapore
Source
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2022
DOI: 10.1109/CVPR52688.2022.00829
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation. Directly employing previous distillation approaches yields inferior results due to the intrinsic challenges of point clouds, i.e., sparsity, randomness, and varying density. To tackle these problems, we propose Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge at both the point level and the voxel level. Specifically, we first leverage both pointwise and voxelwise output distillation to complement the sparse supervision signals. Then, to better exploit the structural information, we divide the whole point cloud into several supervoxels and design a difficulty-aware sampling strategy that more frequently samples supervoxels containing less-frequent classes and faraway objects. On these supervoxels, we propose inter-point and inter-voxel affinity distillation, where the similarity information between points and between voxels helps the student model better capture the structural information of the surrounding environment. We conduct extensive experiments on two popular LiDAR segmentation benchmarks, i.e., nuScenes [3] and SemanticKITTI [1]. On both benchmarks, our PVD consistently outperforms previous distillation approaches by a large margin on three representative backbones, i.e., Cylinder3D [36, 37], SPVNAS [25], and MinkowskiNet [5]. Notably, on the challenging nuScenes and SemanticKITTI datasets, our method achieves roughly 75% MACs reduction and a 2x speedup on the competitive Cylinder3D model and ranks 1st on the SemanticKITTI leaderboard among all published algorithms. Our code is available at https://github.com/cardwing/Codes-for-PVKD.
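The two loss families named in the abstract, output distillation and inter-point affinity distillation, can be sketched as follows. This is an illustrative NumPy reconstruction, not the authors' implementation: the function names are ours, and the specific choices of KL divergence for output distillation and cosine similarity for the affinity matrices are assumptions for the sake of a self-contained example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over class logits."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def output_distillation_loss(student_logits, teacher_logits):
    """Pointwise output distillation: mean KL(teacher || student)
    over N points, each with C class logits (shape (N, C))."""
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    kl = np.sum(p_t * (np.log(p_t + 1e-8) - np.log(p_s + 1e-8)), axis=-1)
    return float(kl.mean())

def affinity_matrix(feats):
    """Pairwise cosine-similarity (affinity) matrix for N feature
    vectors of shape (N, D), e.g. points within one supervoxel."""
    unit = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    return unit @ unit.T

def affinity_distillation_loss(student_feats, teacher_feats):
    """Inter-point affinity distillation: mean squared error between
    the student's and teacher's pairwise affinity matrices."""
    diff = affinity_matrix(student_feats) - affinity_matrix(teacher_feats)
    return float(np.mean(diff ** 2))
```

Matching affinity matrices rather than raw features lets the slim student mimic the teacher's relational structure (which points look alike) without requiring the two networks to share a feature dimensionality.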
Pages: 8469-8478
Number of pages: 10
Related papers (37 total)
[21] Milioto, A. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019: 4213. DOI: 10.1109/IROS40897.2019.8967762
[22] Park, Wonpyo; Kim, Dongju; Lu, Yan; Cho, Minsu. Relational Knowledge Distillation. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: 3962-3971.
[23] Qi, C. R. Advances in Neural Information Processing Systems, 2017, Vol. 30.
[24] Sayre, N. F. Essential Concepts of Global Environmental Governance, 2015: 21.
[25] Shu, Changyong. IEEE International Conference on Computer Vision, 2020.
[26] Tang, Haotian; Liu, Zhijian; Zhao, Shengyu; Lin, Yujun; Lin, Ji; Wang, Hanrui; Han, Song. Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution. Computer Vision - ECCV 2020, Pt. XXVIII, 2020, 12373: 685-702.
[27] Thomas, Hugues; Qi, Charles R.; Deschaud, Jean-Emmanuel; Marcotegui, Beatriz; Goulette, Francois; Guibas, Leonidas J. KPConv: Flexible and Deformable Convolution for Point Clouds. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 6420-6429.
[28] Tung, Frederick; Mori, Greg. Similarity-Preserving Knowledge Distillation. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 1365-1374.
[29] Wang, Yuhao; Gao, Wengui; Li, Kongzhai; Zheng, Yane; Xie, Zhenhua; Na, Wei; Chen, Jingguang G.; Wang, Hua. Strong Evidence of the Role of H2O in Affecting Methanol Selectivity from CO2 Hydrogenation over Cu-ZnO-ZrO2. Chem, 2020, 6(02): 419-430.
[30] Wu, B. C. IEEE International Conference on Robotics and Automation (ICRA), 2018: 1887.