Transformers for Object Detection in Large Point Clouds

Cited by: 1

Authors:

Ruppel, Felicia [1,2]

Faion, Florian [1]

Glaeser, Claudius [1]

Dietmayer, Klaus [2]

Affiliations:

[1] Robert Bosch GmbH, Corp Res, D-71272 Renningen, Germany

[2] Ulm Univ, Inst Measurement Control & Microtechnol, Ulm, Germany

Source:

2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC) | 2022

DOI:

10.1109/ITSC55140.2022.9921840

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We present TransLPC, a novel detection model for large point clouds that is based on a transformer architecture. While object detection with transformers has been an active field of research, it has proved difficult to apply such models to point clouds that span a large area, such as the lidar or radar point clouds common in autonomous driving. TransLPC remedies these issues: the structure of the transformer model is modified to allow for larger input sequence lengths, which are sufficient for large point clouds. In addition, we propose a novel query refinement technique that improves detection accuracy while retaining a memory-friendly number of transformer decoder queries. The queries are efficiently repositioned between layers, moving them closer to the bounding box they are estimating. This simple technique has a significant effect on detection accuracy, which is evaluated on real-world lidar data from the challenging nuScenes dataset. Furthermore, the proposed method is compatible with existing transformer-based solutions that require object detection, e.g., joint multi-object tracking and detection, and enables them to be used with large point clouds.
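The idea of repositioning queries between decoder layers can be pictured with a minimal sketch. It is not the authors' implementation: the function names, the 2D positions, and the toy offset head (which simply steps halfway toward the nearest target) are all assumptions made for illustration, standing in for a learned box-regression head.

```python
import numpy as np

def refine_queries(query_pos, predict_offset, num_layers=3):
    """Reposition query anchors between decoder layers (illustrative only)."""
    history = [query_pos.copy()]
    for layer in range(num_layers):
        # Each layer predicts a box-center offset relative to the query position.
        offset = predict_offset(query_pos, layer)
        # Move every query to its current box-center estimate before the next layer.
        query_pos = query_pos + offset
        history.append(query_pos.copy())
    return query_pos, history

def toy_offset_head(targets):
    """Hypothetical stand-in for a learned regression head."""
    def head(query_pos, layer):
        # Vectors from every query to every target, shape (Q, T, 2).
        diff = targets[None, :, :] - query_pos[:, None, :]
        nearest = np.argmin(np.linalg.norm(diff, axis=-1), axis=1)
        # Step halfway toward the nearest target center.
        return 0.5 * diff[np.arange(len(query_pos)), nearest]
    return head

queries = np.array([[0.0, 0.0], [10.0, 10.0]])   # initial query positions
targets = np.array([[2.0, 0.0], [10.0, 6.0]])    # assumed object centers
final_pos, history = refine_queries(queries, toy_offset_head(targets))
```

In this toy setup each query ends closer to its object after every layer, which mirrors the abstract's description of moving queries toward the boxes they estimate while keeping the number of queries fixed.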

Pages: 832-838

Page count: 7
