GRAMO: geometric resampling augmentation for monocular 3D object detection

Authors
He Guan
Chunfeng Song
Zhaoxiang Zhang
Affiliations
[1] University of Chinese Academy of Sciences, School of Artificial Intelligence
[2] Institute of Automation, Chinese Academy of Sciences, Center for Research on Intelligent Perception and Computing, State Key Laboratory of Multimodal Artificial Intelligence Systems
Source
Frontiers of Computer Science | 2024 / Vol. 18
Keywords
3D detection; monocular; augmentation; geometry
DOI
Not available
Abstract
Data augmentation is widely recognized as an effective means of improving model robustness. However, when applied to monocular 3D object detection, non-geometric image augmentation neglects the critical link between the image and physical space, resulting in the semantic collapse of the augmented scene. To address this issue, we propose two geometric-level data augmentation operators, Geometric-Copy-Paste (Geo-CP) and Geometric-Crop-Shrink (Geo-CS). Both operators enforce geometric consistency based on the principle of perspective projection, expanding the options available for data augmentation in monocular 3D detection. Specifically, Geo-CP replicates local patches and reorders object depths to mitigate perspective occlusion conflicts, while Geo-CS re-crops local patches and scales object distance together with image scale so that appearance and annotation stay consistent. These operations alleviate class imbalance in the monocular paradigm by increasing the number and the distributional coverage of geometrically consistent samples. Experiments demonstrate that our geometric-level augmentation operators effectively improve robustness and performance on the KITTI and Waymo monocular 3D detection benchmarks.
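The abstract does not give implementation details, so the following is only a minimal sketch of the perspective-projection principle the two operators rely on, not the authors' method. It assumes an ideal pinhole camera, and every name in it (`Box3D`, `projected_height_px`, `shrink_consistent`, `paste_order`) is hypothetical. Under that assumption an object of physical height H at depth z projects to h = f * H / z pixels, so shrinking a patch by a factor s is consistent with moving the object to depth z / s, and pasted patches can be drawn from far to near so that nearer objects occlude farther ones.

```python
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical minimal 3D annotation (illustration only)."""
    height_m: float  # physical object height in metres
    depth_m: float   # distance along the camera's optical axis in metres


def projected_height_px(box: Box3D, focal_px: float) -> float:
    """Pinhole projection: image height h = f * H / z."""
    return focal_px * box.height_m / box.depth_m


def shrink_consistent(box: Box3D, scale: float) -> Box3D:
    """Return the annotation that matches a patch rescaled by `scale`.

    Scaling the appearance by `scale` corresponds to depth z / scale,
    while the physical size is left unchanged.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return Box3D(height_m=box.height_m, depth_m=box.depth_m / scale)


def paste_order(boxes: list[Box3D]) -> list[Box3D]:
    """Order boxes far to near, so nearer patches are drawn last."""
    return sorted(boxes, key=lambda b: b.depth_m, reverse=True)


# Usage: halving the patch size doubles the annotated depth.
car = Box3D(height_m=1.5, depth_m=10.0)
far_car = shrink_consistent(car, scale=0.5)
focal = 700.0  # assumed focal length in pixels
ratio = projected_height_px(far_car, focal) / projected_height_px(car, focal)
```

Here `ratio` equals the chosen scale factor, which is the sense in which appearance and annotation agree in this simplified model. A real pipeline would also have to handle the patch's image position, camera intrinsics, and truncation at the image border.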