Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving

Cited by: 21
Authors
Zhu, Zijian [1]
Zhang, Yichi [2, 5]
Chen, Hai [3]
Dong, Yinpeng [2]
Zhao, Shu [3]
Ding, Wenbo [4]
Zhong, Jiachen [4]
Zheng, Shibao [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Image Commun & Network Engn, Shanghai, Peoples R China
[2] Tsinghua Univ, Inst AI, Dept Comp Sci & Tech, THBI Lab,BNRist Ctr, Beijing, Peoples R China
[3] Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc,Minist Edu, Informat Mat & Intelligent Sensing Lab Anhui Prov, Hefei, Peoples R China
[4] SAIC Motor AI Lab, Shanghai, Peoples R China
[5] Zhongguancun Lab, Beijing, Peoples R China
DOI
10.1109/CVPR52729.2023.02069
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D object detection is an essential perception task in autonomous driving for understanding the environment. Bird's-Eye-View (BEV) representations have significantly improved the performance of camera-based 3D detectors on popular benchmarks. However, a systematic understanding of the robustness of these vision-dependent BEV models, which is closely related to the safety of autonomous driving systems, is still lacking. In this paper, we evaluate the natural and adversarial robustness of various representative models under extensive settings, to fully understand how explicit BEV features influence their behavior compared with models without BEV. In addition to the classic settings, we propose a 3D consistent patch attack that applies adversarial patches in 3D space to guarantee spatiotemporal consistency, which is more realistic for autonomous driving scenarios. Through extensive experiments, we draw several findings: 1) BEV models tend to be more stable than previous methods under different natural conditions and common corruptions, owing to their expressive spatial representations; 2) BEV models are more vulnerable to adversarial noise, mainly because of redundant BEV features; 3) camera-LiDAR fusion models achieve superior performance under different settings thanks to multi-modal inputs, but the BEV fusion model is still vulnerable to adversarial noise in both the point cloud and the image. These findings highlight safety issues in applications of BEV detectors and could facilitate the development of more robust models.
Pages: 21600-21610
Page count: 11
Related papers
50 items in total
  • [21] Monocular 3D Object Detection for Autonomous Driving
    Chen, Xiaozhi
    Kundu, Kaustav
    Zhang, Ziyu
    Ma, Huimin
    Fidler, Sanja
    Urtasun, Raquel
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2147 - 2156
  • [22] 3D Object Detection for Autonomous Driving: A Survey
    Qian, Rui
    Lai, Xin
    Li, Xirong
    PATTERN RECOGNITION, 2022, 130
  • [23] BEV-MODNet: Monocular Camera based Bird's Eye View Moving Object Detection for Autonomous Driving
    Rashed, Hazem
    Essam, Mariam
    Mohamed, Maha
    El Sallab, Ahmad
    Yogamani, Senthil
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 1503 - 1508
  • [24] Predicting Bird's-Eye-View Semantic Representations Using Correlated Context Learning
    Chen, Yongquan
    Fan, Weiming
    Zheng, Wenli
    Huang, Rui
    Yu, Jiahui
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (05): : 4718 - 4725
  • [25] BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View
    Barrera, Alejandro
    Guindel, Carlos
    Beltran, Jorge
    Garcia, Fernando
    2020 IEEE 23RD INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2020,
  • [26] Fusing Bird's Eye View LIDAR Point Cloud and Front View Camera Image for 3D Object Detection
    Wang, Zining
    Zhan, Wei
    Tomizuka, Masayoshi
    2018 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2018, : 834 - 839
  • [27] BEVMOT: A multi-object detection and tracking method in Bird's-Eye-View via Spatiotemporal Transformers
    Cui, Chenglin
    Song, Ruiqi
    Li, XinQing
    Ai, Yunfeng
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 5446 - 5451
  • [28] Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
    Wang, Shuo
    Zhao, Xinhai
    Xu, Hai-Ming
    Chen, Zehui
    Yu, Dameng
    Chang, Jiahao
    Yang, Zhen
    Zhao, Feng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 13333 - 13342
  • [29] 3D Object Detection for Autonomous Driving: A Practical Survey
    Ramajo-Ballester, Alvaro
    de la Escalera Hueso, Arturo
    Armingol Moreno, Jose Maria
    PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON VEHICLE TECHNOLOGY AND INTELLIGENT TRANSPORT SYSTEMS, VEHITS 2023, 2023, : 64 - 73
  • [30] 3D Object Detection for Autonomous Driving: A Comprehensive Survey
    Jiageng Mao
    Shaoshuai Shi
    Xiaogang Wang
    Hongsheng Li
    International Journal of Computer Vision, 2023, 131 : 1909 - 1963