Multimodal 3D Object Detection from Simulated Pretraining

Cited by: 11
Authors
Brekke, Asmund [1]
Vatsendvik, Fredrik [1]
Lindseth, Frank [1]
Affiliations
[1] Norwegian Univ Sci & Technol, Trondheim, Norway
Keywords
Autonomous driving; Simulated data; 3D object detection; CARLA; KITTI; AVOD-FPN; LIDAR; Sensor fusion
DOI
10.1007/978-3-030-35664-4_10
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Simulated data has become increasingly important in autonomous driving applications, both for validating pre-trained models and for training new ones. For these models to generalize to real-world applications, it is critical that the underlying dataset contains a variety of driving scenarios and that the simulated sensor readings closely mimic those of real-world sensors. We present the Carla Automated Dataset Extraction Tool (CADET), a novel tool for generating training data from the CARLA simulator for use in autonomous driving research. The tool exports high-quality, synchronized LIDAR and camera data with object annotations, and can be configured to accurately reflect a real-life sensor array. We use this tool to generate a dataset of 10 000 samples and train the 3D object detection network AVOD-FPN on it, then fine-tune on the KITTI dataset to evaluate the potential for effective pretraining. We also present two novel LIDAR feature map configurations in Bird's Eye View for use with AVOD-FPN that can be easily modified. These configurations are tested on the KITTI and CADET datasets to evaluate both their performance and the usability of the simulated dataset for pretraining. Although simulated data is insufficient to fully replace real-world data, and generally cannot exceed the performance of systems trained entirely on real data, our results indicate that it can considerably reduce the amount of training on real data required to reach satisfactory levels of accuracy.
Pages: 102-113
Page count: 12