Context-Aware 3D Object Detection From a Single Image in Autonomous Driving

Cited by: 8
Authors
Zhou, Dingfu [1 ,2 ]
Song, Xibin [1 ,2 ]
Fang, Jin [1 ,2 ]
Dai, Yuchao [3 ]
Li, Hongdong [4 ]
Zhang, Liangjun [1 ,2 ]
Affiliations
[1] Baidu Res, Robot & Autonomous Driving Lab, Beijing 100085, Peoples R China
[2] Natl Engn Lab Deep Learning Technol & Applicat, Beijing 100193, Peoples R China
[3] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710060, Peoples R China
[4] Australian Natl Univ, Coll Engn & Comp Sci, Canberra, ACT 0200, Australia
Funding
Australian Research Council; National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Object detection; Training; Feature extraction; Task analysis; Sensors; Detectors; Monocular 3D object detection; context-aware feature aggregation; self-attention; RECOGNITION; MODEL;
DOI
10.1109/TITS.2022.3154022
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813 ;
Abstract
Camera sensors have been widely used in driver-assistance and autonomous driving systems because they provide rich texture information. Recently, with the development of deep learning techniques, many approaches have been proposed to detect objects in 3D from a single frame; however, there is still much room for improvement. In this paper, we first give a general review of recently proposed state-of-the-art monocular 3D object detection approaches. Based on an analysis of the disadvantages of previous center-based frameworks, we propose a novel feature aggregation strategy that boosts 3D object detection by exploiting context information. Specifically, an Instance-Guided Spatial Attention (IGSA) module is proposed to collect local instance information, and a Channel-Wise Feature Attention (CWFA) module is employed to aggregate global context information. In addition, an instance-guided object regression strategy is proposed to alleviate the influence of center location prediction uncertainty during inference. Finally, the proposed approach is verified on a public 3D object detection benchmark. The experimental results show that the proposed approach significantly boosts the performance of the baseline method on both 3D detection and 2D Bird's-Eye View across all three categories. Furthermore, our method outperforms all monocular-based methods (even those trained with depth as an auxiliary input) and achieves state-of-the-art performance on the KITTI benchmark.
Pages: 18568 - 18580
Number of pages: 13
Related papers
50 in total
  • [31] Spatial Context-Aware Object-Attentional Network for Multi-Label Image Classification
    Zhang, Jialu
    Ren, Jianfeng
    Zhang, Qian
    Liu, Jiang
    Jiang, Xudong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 3000 - 3012
  • [32] 3D-DFM: Anchor-Free Multimodal 3-D Object Detection With Dynamic Fusion Module for Autonomous Driving
    Lin, Chunmian
    Tian, Daxin
    Duan, Xuting
    Zhou, Jianshan
    Zhao, Dezong
    Cao, Dongpu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (12) : 10812 - 10822
  • [33] Context-Aware 3D Point Cloud Semantic Segmentation With Plane Guidance
    Weng, Tingyu
    Xiao, Jun
    Yan, Feilong
    Jiang, Haiyong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 6653 - 6664
  • [34] Collaborative 3D Object Detection for Autonomous Vehicles via Learnable Communications
    Wang, J.
    Zeng, Y.
    Gong, Y.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (09) : 9804 - 9816
  • [35] Discriminative context-aware network for camouflaged object detection
    Ike, Chidiebere Somadina
    Muhammad, Nazeer
    Bibi, Nargis
    Alhazmi, Samah
    Eoghan, Furey
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2024, 7
  • [36] Class-aware single image to 3D object translational autoencoder
    Turhan, Ceren Guzel
    Bilge, Hasan Sakir
    IET IMAGE PROCESSING, 2020, 14 (13) : 3046 - 3053
  • [37] MKD-Cooper: Cooperative 3D Object Detection for Autonomous Driving via Multi-Teacher Knowledge Distillation
    Li, Zhiyuan
    Liang, Huawei
    Wang, Hanqi
    Zhao, Mingzhuo
    Wang, Jian
    Zheng, Xiaokun
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 1490 - 1500
  • [38] Transformer3D-Det: Improving 3D Object Detection by Vote Refinement
    Zhao, Lichen
    Guo, Jinyang
    Xu, Dong
    Sheng, Lu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (12) : 4735 - 4746
  • [39] Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection
    Liu, Ajian
    Zhao, Chenxu
    Yu, Zitong
    Wan, Jun
    Su, Anyang
    Liu, Xing
    Tan, Zichang
    Escalera, Sergio
    Xing, Junliang
    Liang, Yanyan
    Guo, Guodong
    Lei, Zhen
    Li, Stan Z.
    Zhang, Du
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2022, 17 : 2497 - 2507
  • [40] RangeLVDet: Boosting 3D Object Detection in LIDAR With Range Image and RGB Image
    Zhang, Zehan
    Liang, Zhidong
    Zhang, Ming
    Zhao, Xian
    Li, Hao
    Yang, Ming
    Tan, Wenming
    Pu, Shiliang
    IEEE SENSORS JOURNAL, 2022, 22 (02) : 1391 - 1403