GRAMO: geometric resampling augmentation for monocular 3D object detection

Authors
He Guan
Chunfeng Song
Zhaoxiang Zhang
Affiliations
[1] University of Chinese Academy of Sciences, School of Artificial Intelligence
[2] Institute of Automation, Chinese Academy of Sciences, Center for Research on Intelligent Perception and Computing, State Key Laboratory of Multimodal Artificial Intelligence Systems
Source
Frontiers of Computer Science | 2024 / Vol. 18, Issue 5
Keywords
3D detection; monocular; augmentation; geometry
DOI
Not available
Abstract
Data augmentation is widely recognized as an effective means of improving model robustness. However, when applied to monocular 3D object detection, non-geometric image augmentation neglects the critical link between the image and physical space, resulting in the semantic collapse of the augmented scene. To address this issue, we propose two geometric-level data augmentation operators, Geometric-Copy-Paste (Geo-CP) and Geometric-Crop-Shrink (Geo-CS). Both operators enforce geometric consistency based on the principle of perspective projection, complementing the existing data augmentation options for monocular 3D detection. Specifically, Geo-CP replicates local patches and reorders them by object depth to mitigate perspective occlusion conflicts, while Geo-CS re-crops local patches and scales object distance together with image scale so that appearance and annotation stay consistent. These operations alleviate class imbalance in the monocular paradigm by increasing the number and broadening the distribution of geometrically consistent samples. Experiments demonstrate that our geometric-level augmentation operators effectively improve robustness and performance on the KITTI and Waymo monocular 3D detection benchmarks.
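The perspective-projection principle that the abstract relies on can be illustrated with a minimal sketch. This is not the authors' implementation: the `Obj` class, the function names, and the example focal length are all hypothetical, and the code assumes an ideal pinhole camera in which an object of physical height H at depth Z appears f * H / Z pixels tall. Under that assumption, shrinking a patch by a factor s is consistent only if the depth label moves to Z / s, and pasted patches avoid occlusion conflicts when drawn from far to near.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Obj:
    """Hypothetical annotation: physical height and depth, both in metres."""
    name: str
    height_m: float
    depth_m: float


def pixel_height(obj: Obj, focal_px: float) -> float:
    """Ideal pinhole projection: apparent height in pixels is f * H / Z."""
    return focal_px * obj.height_m / obj.depth_m


def shrink(obj: Obj, scale: float) -> Obj:
    """Return the annotation that matches a patch resized by `scale`.

    The physical height is unchanged, so the depth must become Z / scale:
    f * H / (Z / scale) == scale * (f * H / Z).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return Obj(obj.name, obj.height_m, obj.depth_m / scale)


def paste_order(objs: List[Obj]) -> List[Obj]:
    """Sort far-to-near so nearer patches are pasted last and occlude farther ones."""
    return sorted(objs, key=lambda o: o.depth_m, reverse=True)


# Usage with made-up numbers (focal length of 720 px is an assumption).
car = Obj("car", 1.5, 20.0)
half = shrink(car, 0.5)
order = [o.name for o in paste_order([car, Obj("cyclist", 1.7, 8.0), half])]
```

In this example the half-size patch carries a depth of 40 m and projects to half the original pixel height, and the paste order places the 40 m copy first and the 8 m cyclist last. How the paper handles the crop location, ground-plane alignment, and camera intrinsics is not stated in the abstract and is not modelled here.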