3D point cloud object detection algorithm based on Transformer

Cited by: 0

Authors:

Liu M. ^{[1]}

Yang Q. ^{[2]}

Hu G. ^{[2,3]}

Guo Y. ^{[4]}

Zhang J. ^{[2]}

Affiliations:

[1] Shenyang Aircraft Design Research Institute, Shenyang

[2] School of Electronics and Information, Northwestern Polytechnical University, Xi'an

[3] CSSC Systems Engineering Research Institute, Beijing

[4] No.1 Military Representative Office of Equipment Department of PLA Airforce in Shenyang, Shenyang

Source:

Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University | 2023, Vol. 41, No. 06

Keywords:

deep learning; heat map initialization; spatial modulation attention mechanism; target detection; Transformer

DOI:

10.1051/jnwpu/20234161190


Abstract:

Anchor-box-based methods are difficult to deploy in 3D object detection because of the added spatial dimension. To address this, the paper studies point cloud object detection based on set prediction and proposes a Transformer-based 3D point cloud object detection algorithm. Drawing on the characteristics of point clouds in autonomous driving scenarios, it introduces an improved spatial modulation attention mechanism and a heat map initialization strategy, used for training acceleration and query initialization respectively, and achieves good detection performance with a shallow network. Comparisons with other algorithms on the KITTI dataset show that the proposed algorithm reaches an advanced level of performance. Ablation experiments on the main components verify the contribution of each module to detection performance. ©2023 Journal of Northwestern Polytechnical University.
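The abstract does not spell out how heat map initialization works, so the following is only a generic sketch of the idea, not the authors' implementation. It assumes a 2D bird's-eye-view heat map of objectness scores and picks the strongest local maxima as candidate query positions. The function name `topk_heatmap_peaks`, the 3x3 peak test, and the toy heat map are all illustrative assumptions.

```python
import numpy as np


def topk_heatmap_peaks(heatmap, k):
    """Return (row, col, score) for the k strongest local maxima of a 2D heat map.

    Hypothetical helper for illustration; the paper's actual strategy may differ.
    """
    h, w = heatmap.shape
    # Pad with -inf so border cells are compared against a full 3x3 neighbourhood.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Build the nine shifted views and take the neighbourhood maximum per cell.
    shifts = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    local_max = np.max(np.stack(shifts), axis=0)
    # A cell counts as a peak when it equals the maximum of its neighbourhood.
    peak_scores = np.where(heatmap == local_max, heatmap, -np.inf)
    # Sort all cells by peak score, descending, and keep the first k.
    order = np.argsort(peak_scores, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(order, heatmap.shape)
    return [
        (int(r), int(c), float(peak_scores[r, c]))
        for r, c in zip(rows, cols)
        if np.isfinite(peak_scores[r, c])
    ]


# Toy 5x5 heat map (assumed data): two clear peaks and one suppressed neighbour.
heatmap = np.zeros((5, 5), dtype=float)
heatmap[1, 1] = 0.9
heatmap[3, 3] = 0.8
heatmap[1, 2] = 0.5  # adjacent to the 0.9 peak, so it is not a local maximum

peaks = topk_heatmap_peaks(heatmap, k=2)
```

In a detector of this kind, the selected positions would typically be used to gather features that seed the decoder queries, in place of randomly initialized or learned queries.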

Citation

Pages: 1190-1197

Page count: 7

