DeployFusion: A Deployable Monocular 3D Object Detection with Multi-Sensor Information Fusion in BEV for Edge Devices

Citations: 0

Authors:

Huang, Fei [1]

Liu, Shengshu [1]

Zhang, Guangqian [2]

Hao, Bingsen [3]

Xiang, Yangkai [3]

Yuan, Kun [3]

Affiliations:

[1] China Rd & Bridge Corp, Beijing 100010, Peoples R China

[2] Chongqing Seres Phoenix Intelligent Innovat Techno, Chongqing 400039, Peoples R China

[3] Chongqing Jiaotong Univ, Sch Mechatron & Vehicle Engn, Chongqing 400074, Peoples R China

Source:

SENSORS | 2024 / Vol. 24 / Issue 21

Keywords:

multi-sensor information fusion; 3D object detection; BEV; feature fusion; model deployment;

DOI:

10.3390/s24217007

Chinese Library Classification:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

To address the challenges of suboptimal long-range detection and significant computational burden in existing multi-sensor information fusion 3D object detection methods, a novel approach based on Bird's-Eye View (BEV) is proposed. This method utilizes an enhanced lightweight EdgeNeXt feature extraction network, incorporating residual branches to address network degradation caused by the excessive depth of STDA encoding blocks. Meanwhile, deformable convolution is used to expand the receptive field and reduce computational complexity. The feature fusion module constructs a two-stage fusion network to optimize the fusion and alignment of multi-sensor features. This network aligns image features with point cloud features so that the image features supplement the point cloud with environmental information, thereby obtaining the final BEV features. Additionally, a Transformer decoder that emphasizes global spatial cues is employed to process the BEV feature sequence, enabling precise detection of distant small objects. Experimental results demonstrate that this method surpasses the baseline network, with improvements of 4.5% in the NuScenes detection score and 5.5% in average precision for detected objects. Finally, the model is converted and accelerated using TensorRT tools for deployment on mobile devices, achieving an inference time of 138 ms per frame on the Jetson Orin NX embedded platform, thus enabling real-time 3D object detection.

Pages: 18

40 entries in total

[31] High-accuracy road surface condition detection through multi-sensor information fusion based on WOA-BP neural network [J].

Jiang, Jingqi ;

Xu, Gaobin ;

Wang, Huanzhang ;

Yang, Zhaohui ;

Sun, Baichuan ;

Guan, Cunhe ;

Feng, Jianguo ;

Ma, Yuanming ;

Chen, Xing .

SENSORS AND ACTUATORS A-PHYSICAL, 2024, 378

[32] Multi-Range View Aggregation Network With Vision Transformer Feature Fusion for 3D Object Retrieval [J].

Lin, Dongyun ;

Li, Yiqun ;

Cheng, Yi ;

Prasad, Shitala ;

Guo, Aiyuan ;

Cao, Yanpeng .

IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 :9108-9119

[33] Multi-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection [J].

Zhai, Zhenyu ;

Wang, Qiantong ;

Pan, Zongxu ;

Gao, Zhentong ;

Hu, Wenlong .

SENSORS, 2022, 22 (19)

[34] RI-Fusion: 3D Object Detection Using Enhanced Point Features With Range-Image Fusion for Autonomous Driving [J].

Zhang, Xinyu ;

Wang, Li ;

Zhang, Guoxin ;

Lan, Tianwei ;

Zhang, Haoming ;

Zhao, Lijun ;

Li, Jun ;

Zhu, Lei ;

Liu, Huaping .

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72

[37] MSSA: Multi-Representation Semantics-Augmented Set Abstraction for 3D Object Detection [J].

Liu, Huaijin ;

Du, Jixiang ;

Zhang, Yong ;

Zhang, Hongbo ;

Zeng, Jiandian .

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (10)

[38] NCFDet: Enhanced point cloud features using the neural collapse phenomenon in multimodal fusion for 3D object detection [J].

Xu, Yaming ;

Xu, Minglei ;

Wang, Yan ;

Li, Boliang .

JOURNAL OF COMPUTATIONAL DESIGN AND ENGINEERING, 2025, 12 (01) :300-311

[39] 3D object detection based on sparse convolution neural network and feature fusion for autonomous driving in smart cities [J].

Wang, Lei ;

Fan, Xiaoyun ;

Chen, Jiahao ;

Cheng, Jun ;

Tan, Jun ;

Ma, Xiaoliang .

SUSTAINABLE CITIES AND SOCIETY, 2020, 54 (54)

[40] 3D Sensor Based Pedestrian Detection by Integrating Improved HHA Encoding and Two-Branch Feature Fusion [J].

Tan, Fang ;

Xia, Zhaoqiang ;

Ma, Yupeng ;

Feng, Xiaoyi .

REMOTE SENSING, 2022, 14 (03)
