3D Data Augmentation for Driving Scenes on Camera

Cited by: 0
Authors
Tong, Wenwen [1]
Xi, Jiangwei [1]
Li, Tianyu [2, 3]
Li, Yang [2]
Deng, Hanming [1]
Dai, Bo [4]
Lu, Lewei [1]
Zhao, Hao [5]
Yan, Junchi [5]
Li, Hongyang [2, 6]
Affiliations
[1] SenseTime, Sha Tin, Hong Kong, People's Republic of China
[2] Shanghai AI Lab, OpenDriveLab, Shanghai, People's Republic of China
[3] Fudan University, Shanghai, People's Republic of China
[4] Tsinghua University, Beijing, People's Republic of China
[5] Shanghai Jiao Tong University, Shanghai, People's Republic of China
[6] University of Hong Kong, Pok Fu Lam, Hong Kong, People's Republic of China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VI | 2025 / Vol. 15036
Keywords
Autonomous Driving; 3D Perception; Data Augmentation; NeRF
DOI
10.1007/978-981-97-8508-7_4
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Driving scenes are so diverse and complicated that it is impossible to collect all cases with human effort alone. While data augmentation is an effective technique for enriching training data, existing methods for camera data in autonomous driving are confined to the 2D image plane, which may not optimally increase data diversity in real-world 3D scenarios. To this end, we propose a 3D data augmentation approach termed Drive-3DAug, which augments camera-captured driving scenes in 3D space. We first utilize Neural Radiance Fields (NeRF) to reconstruct 3D models of the background and foreground objects. Augmented driving scenes are then obtained by placing the 3D objects, with adapted location and orientation, in pre-defined valid regions of the backgrounds. As such, the training database can be effectively scaled up. However, the 3D object modeling is constrained by image quality and limited viewpoints. To overcome these problems, we modify the original NeRF by introducing a geometric rectified loss and a symmetric-aware training strategy. We evaluate our method on the camera-only monocular 3D detection task using the Waymo and nuScenes datasets. The proposed data augmentation approach yields gains of 1.7% and 1.4% in detection accuracy on Waymo and nuScenes, respectively. Furthermore, the constructed 3D models serve as digital driving assets and can be reused for different detectors or other 3D perception tasks.
Pages: 46-63
Page count: 18
Related papers
35 in total
  • [1] M3D-RPN: Monocular 3D Region Proposal Network for Object Detection
    Brazil, Garrick
    Liu, Xiaoming
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 9286 - 9295
  • [2] Caesar H, 2020, PROC CVPR IEEE, P11618, DOI 10.1109/CVPR42600.2020.01164
  • [3] Part-Aware Data Augmentation for 3D Object Detection in Point Cloud
    Choi, Jaeseok
    Song, Yeji
    Kwak, Nojun
    [J]. 2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 3391 - 3397
  • [4] Cubuk ED, 2019, Arxiv, DOI arXiv:1805.09501
  • [5] Depth-supervised NeRF: Fewer Views and Faster Training for Free
    Deng, Kangle
    Liu, Andrew
    Zhu, Jun-Yan
    Ramanan, Deva
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12872 - 12881
  • [6] Dosovitskiy Alexey, 2017, PROC 1 ANN C ROBOT L, P1
  • [7] LiDAR-Aug: A General Rendering-based Augmentation Framework for 3D Object Detection
    Fang, Jin
    Zuo, Xinxin
    Zhou, Dingfu
    Jin, Shengze
    Wang, Sen
    Zhang, Liangjun
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4708 - 4718
  • [8] Plenoxels: Radiance Fields without Neural Networks
    Fridovich-Keil, Sara
    Yu, Alex
    Tancik, Matthew
    Chen, Qinhong
    Recht, Benjamin
    Kanazawa, Angjoo
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5491 - 5500
  • [9] Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
    Ghiasi, Golnaz
    Cui, Yin
    Srinivas, Aravind
    Qian, Rui
    Lin, Tsung-Yi
    Cubuk, Ekin D.
    Le, Quoc V.
    Zoph, Barret
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 2917 - 2927
  • [10] 3D Packing for Self-Supervised Monocular Depth Estimation
    Guizilini, Vitor
    Ambrus, Rares
    Pillai, Sudeep
    Raventos, Allan
    Gaidon, Adrien
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2482 - 2491