Multimodal 3D Object Detection from Simulated Pretraining

Cited by: 11

Authors:

Brekke, Asmund [1]

Vatsendvik, Fredrik [1]

Lindseth, Frank [1]

Affiliations:

[1] Norwegian Univ Sci & Technol, Trondheim, Norway

Source:

NORDIC ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT | 2019 / Vol. 1056

Keywords:

Autonomous driving; Simulated data; 3D object detection; CARLA; KITTI; AVOD-FPN; LIDAR; Sensor fusion

DOI:

10.1007/978-3-030-35664-4_10

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Simulated data has become increasingly important in autonomous driving applications, both for validating pre-trained models and for training new ones. For these models to generalize to real-world applications, it is critical that the underlying dataset contains a variety of driving scenarios and that the simulated sensor readings closely mimic those of real-world sensors. We present the Carla Automated Dataset Extraction Tool (CADET), a novel tool for generating training data from the CARLA simulator for use in autonomous driving research. The tool exports high-quality, synchronized LIDAR and camera data with object annotations, and can be configured to accurately reflect a real-life sensor array. Furthermore, we use this tool to generate a dataset of 10 000 samples and use that dataset to train the 3D object detection network AVOD-FPN, with finetuning on the KITTI dataset, to evaluate the potential for effective pretraining. We also present two novel, easily modified LIDAR feature map configurations in Bird's Eye View for use with AVOD-FPN. These configurations are tested on the KITTI and CADET datasets to evaluate their performance as well as the usability of the simulated dataset for pretraining. Although simulated data is insufficient to fully replace real-world data, and models pretrained on it generally do not exceed the performance of systems trained entirely on real data, our results indicate that simulated data can considerably reduce the amount of training on real data required to achieve satisfactory levels of accuracy.
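As an illustration of what a Bird's Eye View (BEV) LIDAR feature map is, the following is a minimal sketch that projects a point cloud onto a ground-plane grid with a few height-slice channels and one density channel. It is not the paper's implementation, and it does not reproduce the two novel configurations, which the abstract does not specify. The function name `bev_feature_map`, the default ranges, the 0.1 m resolution, the five slices and the `log(16)` density normalisation are all assumptions chosen for illustration.

```python
import numpy as np


def bev_feature_map(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                    z_range=(0.0, 2.5), resolution=0.1, num_slices=5):
    """Project an (N, 3) point cloud to a BEV grid (hypothetical helper).

    Returns an array of shape (rows, cols, num_slices + 1): one max-height
    channel per vertical slice, followed by one point-density channel.
    All default values are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float32)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    # Discard points outside the region of interest.
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z = x[keep], y[keep], z[keep]

    rows = int(round((x_range[1] - x_range[0]) / resolution))
    cols = int(round((y_range[1] - y_range[0]) / resolution))

    # Grid cell and vertical slice index of every remaining point.
    r = np.minimum(((x - x_range[0]) / resolution).astype(np.int64), rows - 1)
    c = np.minimum(((y - y_range[0]) / resolution).astype(np.int64), cols - 1)
    slice_height = (z_range[1] - z_range[0]) / num_slices
    s = np.minimum(((z - z_range[0]) / slice_height).astype(np.int64),
                   num_slices - 1)

    bev = np.zeros((rows, cols, num_slices + 1), dtype=np.float32)

    # Height channels: normalised maximum height per cell and slice.
    height = (z - z_range[0]) / (z_range[1] - z_range[0])
    np.maximum.at(bev, (r, c, s), height)

    # Density channel: log-scaled point count per cell, capped at 1.
    counts = np.zeros((rows, cols), dtype=np.float32)
    np.add.at(counts, (r, c), 1.0)
    bev[:, :, -1] = np.minimum(1.0, np.log(counts + 1.0) / np.log(16.0))
    return bev
```

Calling `bev_feature_map(cloud)` on an `(N, 3)` array of LIDAR points in metres yields a `(700, 800, 6)` image-like tensor under the default assumptions. Changing `num_slices` or the ranges changes the channel layout, which is the kind of knob a feature map configuration would adjust.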


Pages: 102-113

Page count: 12
