SCNet3D: Rethinking the Feature Extraction Process of Pillar-Based 3D Object Detection

Cited by: 0
Authors
Li, Junru [1 ,2 ,3 ]
Wang, Zhiling [1 ,4 ]
Gong, Diancheng [1 ,2 ,3 ]
Wang, Chunchun [1 ,5 ]
Affiliations
[1] Chinese Acad Sci, Hefei Inst Phys Sci, Hefei 230031, Peoples R China
[2] Univ Sci & Technol China, Hefei 230026, Peoples R China
[3] Anhui ShineAuto Autonomous Driving Technol Co Ltd, Hefei 230088, Peoples R China
[4] Anhui Engn Lab Intelligent Driving Technol & Appli, Hefei 230031, Peoples R China
[5] Anhui Univ Sci & Technol, Huainan 232002, Peoples R China
Keywords
Feature extraction; Three-dimensional displays; Point cloud compression; Object detection; Pedestrians; Data mining; Automobiles; Shape; Finite element analysis; Accuracy; 3D object detection; point cloud; autonomous vehicle; SCNet3D;
DOI
10.1109/TITS.2024.3486324
Chinese Library Classification (CLC)
TU [Architectural Science];
Discipline Classification Code
0813;
Abstract
LiDAR-based 3D object detection is essential for autonomous driving. To extract information from sparse and unordered point cloud data, pillar-based methods make the data compact and orderly by converting point clouds into pseudo-images. However, these methods suffer from limited feature extraction capability and tend to lose key information during the conversion, leading to lower detection accuracy than voxel-based or point-based methods, especially for small objects. In this paper, we propose SCNet3D, a novel pillar-based method that tackles the challenges of feature enhancement, information preservation, and small-target detection from the perspectives of both features and data. We first introduce a Feature Enhancement Module (FEM), which uses an attention mechanism to weight features along three dimensions and enhances 3D features layer by layer, from local to global. Then, an STMod-Convolution Network (SCNet) is designed, which achieves sufficient feature extraction and fusion of BEV pseudo-images through two channels, one for basic features and one for advanced features. Moreover, a Shape and Distance Aware Data Augmentation (SDAA) approach is proposed to add more samples to the point cloud during training while preserving the samples' original shapes and distances. Extensive experiments demonstrate that SCNet3D delivers superior performance and excellent robustness. Remarkably, SCNet3D achieves an AP of 82.35% in the moderate Car category, 44.64% in the moderate Pedestrian category, and 67.55% in the moderate Cyclist category on the KITTI test split of the 3D detection benchmark, outperforming many state-of-the-art 3D detectors.
Pages: 770-784
Page count: 15