Transformers for Object Detection in Large Point Clouds

Cited by: 1
Authors
Ruppel, Felicia [1, 2]
Faion, Florian [1]
Glaeser, Claudius [1]
Dietmayer, Klaus [2]
Affiliations
[1] Robert Bosch GmbH, Corp Res, D-71272 Renningen, Germany
[2] Ulm Univ, Inst Measurement Control & Microtechnol, Ulm, Germany
Source
2022 IEEE 25TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC) | 2022
DOI
10.1109/ITSC55140.2022.9921840
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present TransLPC, a novel detection model for large point clouds that is based on a transformer architecture. While object detection with transformers has been an active field of research, it has proved difficult to apply such models to point clouds that span a large area, e.g. those that are common in autonomous driving, with lidar or radar data. TransLPC is able to remedy these issues: The structure of the transformer model is modified to allow for larger input sequence lengths, which are sufficient for large point clouds. Besides this, we propose a novel query refinement technique to improve detection accuracy, while retaining a memory-friendly number of transformer decoder queries. The queries are repositioned between layers, moving them closer to the bounding box they are estimating, in an efficient manner. This simple technique has a significant effect on detection accuracy, which is evaluated on the challenging nuScenes dataset on real-world lidar data. Besides this, the proposed method is compatible with existing transformer-based solutions that require object detection, e.g. for joint multi-object tracking and detection, and enables them to be used in conjunction with large point clouds.
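The query repositioning described in the abstract can be illustrated with a toy sketch. This is not the TransLPC implementation: the function names, the 2D positions, and the stand-in `toy_layer_offsets` (which replaces a real transformer decoder layer and box head) are all assumptions made for illustration. It only shows the general idea of moving each query's reference point toward its current box estimate after every layer.

```python
import numpy as np

def toy_layer_offsets(query_xy, object_xy, step=0.5):
    """Hypothetical stand-in for a decoder layer plus box head.

    Each query "predicts" an offset that covers a fraction `step` of the
    distance to its nearest object centre. A real model would learn this.
    """
    # Pairwise distances, shape (num_queries, num_objects).
    dists = np.linalg.norm(query_xy[:, None, :] - object_xy[None, :, :], axis=-1)
    nearest = object_xy[np.argmin(dists, axis=1)]
    return step * (nearest - query_xy)

def refine_queries(query_xy, object_xy, num_layers=3):
    """Reposition the query reference points after every layer."""
    history = [query_xy.copy()]
    for _ in range(num_layers):
        offsets = toy_layer_offsets(query_xy, object_xy)
        # Move each query toward the box centre it currently estimates.
        query_xy = query_xy + offsets
        history.append(query_xy.copy())
    return history

# Two queries and two object centres in a bird's-eye-view plane (metres, assumed).
queries = np.array([[0.0, 0.0], [40.0, -20.0]])
objects = np.array([[8.0, 4.0], [35.0, -30.0]])
history = refine_queries(queries, objects)
```

In this toy setup the number of queries stays fixed while their positions change, which mirrors the abstract's point that accuracy can improve without increasing the number of decoder queries.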
Pages: 832-838
Number of pages: 7