GRAMO: geometric resampling augmentation for monocular 3D object detection

Citations: 0
Authors
He Guan
Chunfeng Song
Zhaoxiang Zhang
Affiliations
[1] University of Chinese Academy of Sciences, School of Artificial Intelligence
[2] Institute of Automation, Chinese Academy of Sciences, Center for Research on Intelligent Perception and Computing, State Key Laboratory of Multimodal Artificial Intelligence Systems
Source
Frontiers of Computer Science | 2024, Vol. 18
Keywords
3D detection; monocular; augmentation; geometry
DOI
Not available
Abstract
Data augmentation is widely recognized as an effective means of bolstering model robustness. However, when applied to monocular 3D object detection, non-geometric image augmentation neglects the critical link between the image and physical space, resulting in the semantic collapse of the augmented scene. To address this issue, we propose two geometric-level data augmentation operators, Geometric-Copy-Paste (Geo-CP) and Geometric-Crop-Shrink (Geo-CS). Both operators enforce geometric consistency based on the principle of perspective projection, complementing the options available for data augmentation in monocular 3D detection. Specifically, Geo-CP replicates local patches while reordering object depths to mitigate perspective occlusion conflicts, and Geo-CS re-crops local patches so that distance and scale change simultaneously, keeping appearance and annotation consistent. These operations also ameliorate class imbalance in the monocular paradigm by increasing the quantity and distribution of geometrically consistent samples. Experiments demonstrate that our geometric-level augmentation operators effectively improve robustness and performance on the KITTI and Waymo monocular 3D detection benchmarks.
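The geometric consistency that both operators rely on follows from the pinhole camera model: under perspective projection, an object's apparent 2D size is inversely proportional to its depth. The sketch below illustrates this relation only; it is not the authors' implementation, and the function name and box convention are assumptions for illustration.

```python
# Minimal sketch of the perspective-projection relation behind
# geometric-consistent augmentation (pinhole camera model):
# moving an object from depth Z to Z_new scales its projected
# 2D size by Z / Z_new, so the 2D box must be resized to match.

def rescale_box_for_new_depth(box_2d, depth, new_depth):
    """Resize a 2D box (x1, y1, x2, y2) about its center so its
    apparent size stays consistent with the new object depth."""
    x1, y1, x2, y2 = box_2d
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    s = depth / new_depth            # projected size is proportional to 1 / depth
    w, h = (x2 - x1) * s, (y2 - y1) * s
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
```

For example, doubling an object's depth halves its box width and height about the box center, which is the coupling of "distance and scale" that Geo-CS maintains between the image patch and its 3D annotation.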