STFNET: Sparse Temporal Fusion for 3D Object Detection in LiDAR Point Cloud

Cited by: 0

Authors:

Meng, Xin [1]

Zhou, Yuan [2]

Ma, Jun [1]

Jiang, Fangdi [1]

Qi, Yongze [1]

Wang, Cui [3]

Kim, Jonghyuk [4]

Wang, Shifeng [1,3]

Affiliations:

[1] Changchun Univ Sci & Technol, Sch Optoelect Engn, Changchun 130022, Peoples R China

[2] Leapmotor, Hangzhou 310000, Peoples R China

[3] Changchun Univ Sci & Technol, Zhongshan Inst, Zhongshan 528400, Peoples R China

[4] Naif Arab Univ Secur Sci, Ctr Excellence Cybercrimes & Digital Forens, Riyadh 11452, Saudi Arabia

Source:

IEEE SENSORS JOURNAL | 2025 / Vol. 25 / Issue 03

Keywords:

Feature extraction; Three-dimensional displays; Point cloud compression; Object detection; Laser radar; History; Sensors; Proposals; Heating systems; Fuses; 3D object detection; autonomous vehicle; LiDAR; point cloud;

DOI:

10.1109/JSEN.2024.3519603

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In autonomous driving and robotics, 3D object detection from LiDAR point clouds is a critical task. However, existing single-frame 3D object detection methods are affected by noise, occlusion, and sparsity, which degrade detection performance. To address these issues, we propose the sparse temporal fusion network (STFNet), which leverages multiframe historical information to improve 3D object detection accuracy. STFNet comprises three core modules: a multihistory feature alignment module (MFAM), a sparse feature extraction module (SFEM), and a temporal fusion transformer (TFformer). MFAM uses ego-motion compensation to align frames, establishing correlations between adjacent frames along the temporal dimension. SFEM performs sparse extraction on features from different time steps to obtain the key features within the time series. TFformer introduces a temporal fusion attention mechanism that enables deep interaction between the current and historical frames. We validate STFNet on the nuScenes dataset, where it achieves a 71.8% nuScenes detection score (NDS) and 67.0% mean average precision (mAP). Compared with the benchmark method, our method improves NDS by 1.6% and mAP by 1.5%. Extensive experiments demonstrate that STFNet significantly outperforms most existing methods, highlighting the superiority and generalizability of our approach.
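The ego-motion compensation mentioned for MFAM can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes alignment of features, whereas the sketch below aligns raw points, and it assumes a simplified planar ego pose (x, y, yaw) with an x-forward, y-left convention. The helper names `pose_matrix`, `invert_rigid`, `matmul`, and `align_to_current` are hypothetical and chosen only for illustration.

```python
import math


def pose_matrix(x, y, yaw):
    """Build a 4x4 world-from-ego transform from a planar pose (assumed simplification)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, x],
        [s, c, 0.0, y],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def invert_rigid(T):
    """Invert a rigid transform: rotation becomes R^T, translation becomes -R^T t."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def align_to_current(points, world_from_hist, world_from_cur):
    """Map (x, y, z) points from a historical ego frame into the current ego frame."""
    # current_from_hist = inverse(world_from_cur) @ world_from_hist
    T = matmul(invert_rigid(world_from_cur), world_from_hist)
    return [
        tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))
        for x, y, z in points
    ]


# The ego vehicle drove 2 m forward between the historical and current frames.
hist_pose = pose_matrix(0.0, 0.0, 0.0)
cur_pose = pose_matrix(2.0, 0.0, 0.0)
aligned = align_to_current([(10.0, 1.0, 0.5)], hist_pose, cur_pose)
# A static point seen 10 m ahead in the historical frame is 8 m ahead now.
```

After this step, static structure from different sweeps overlaps in the current frame, so any remaining displacement between frames is due to object motion rather than ego motion. A real pipeline would use full 6-DoF poses and vectorized array operations.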


Pages: 5866-5877

Page count: 12

