DeployFusion: A Deployable Monocular 3D Object Detection with Multi-Sensor Information Fusion in BEV for Edge Devices

Cited by: 0
Authors
Huang, Fei [1]
Liu, Shengshu [1]
Zhang, Guangqian [2]
Hao, Bingsen [3]
Xiang, Yangkai [3]
Yuan, Kun [3]
Affiliations
[1] China Rd & Bridge Corp, Beijing 100010, Peoples R China
[2] Chongqing Seres Phoenix Intelligent Innovat Techno, Chongqing 400039, Peoples R China
[3] Chongqing Jiaotong Univ, Sch Mechatron & Vehicle Engn, Chongqing 400074, Peoples R China
Keywords
multi-sensor information fusion; 3D object detection; BEV; feature fusion; model deployment
DOI
10.3390/s24217007
Chinese Library Classification
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
To address the poor long-range detection and heavy computational burden of existing multi-sensor information fusion methods for 3D object detection, a novel approach based on Bird's-Eye View (BEV) is proposed. The method uses an enhanced lightweight EdgeNeXt feature extraction network, which incorporates residual branches to counter the network degradation caused by the excessive depth of the STDA encoding blocks. Meanwhile, deformable convolution is used to expand the receptive field and reduce computational complexity. The feature fusion module builds a two-stage fusion network to optimize the fusion and alignment of multi-sensor features. This network aligns image features with point cloud features so that the image features supplement the environmental information, yielding the final BEV features. Additionally, a Transformer decoder that emphasizes global spatial cues processes the BEV feature sequence, enabling precise detection of distant small objects. Experimental results show that the method surpasses the baseline network, with improvements of 4.5% in the nuScenes detection score and 5.5% in average precision for detected objects. Finally, the model is converted and accelerated with TensorRT for deployment on mobile devices, achieving an inference time of 138 ms per frame on the Jetson Orin NX embedded platform, thus enabling real-time 3D object detection.
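As a rough aid for reading the reported latency, the sketch below converts the 138 ms per-frame inference time into a frame rate and compares it with a required update rate. The function names and the example target rates (5 Hz and 10 Hz) are illustrative assumptions, not values from the paper; whether a given rate counts as real-time depends on the sensor and application.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def meets_rate(latency_ms: float, required_hz: float) -> bool:
    """Return True if the latency sustains at least required_hz frames per second."""
    return frames_per_second(latency_ms) >= required_hz


# Per-frame inference time reported in the abstract (Jetson Orin NX, TensorRT).
REPORTED_LATENCY_MS = 138.0

fps = frames_per_second(REPORTED_LATENCY_MS)
print(f"{fps:.2f} frames per second")

# Hypothetical target rates, chosen only to illustrate the comparison.
for target_hz in (5.0, 10.0):
    print(target_hz, meets_rate(REPORTED_LATENCY_MS, target_hz))
```

Under these assumptions, 138 ms per frame corresponds to a little over 7 frames per second, which would satisfy a 5 Hz requirement but not a 10 Hz one.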
Pages: 18