Semantic Shape and Trajectory Reconstruction for Monocular Cooperative 3D Object Detection

Citations: 0
Authors
Cserni, Marton [1]
Rovid, Andras [1]
Affiliations
[1] Budapest Univ Technol & Econ BME, Fac Transportat Engn & Vehicle Engn, Dept Automot Technol, H-1111 Budapest, Hungary
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Semantics; Three-dimensional displays; Image reconstruction; Solid modeling; Trajectory; Pose estimation; Accuracy; Cameras; Computational modeling; Autonomous driving; shape-aware monocular 3D object detection; trajectory reconstruction; semantic keypoints; cooperative perception
DOI
10.1109/ACCESS.2024.3484672
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Currently, state-of-the-art monocular 3D object detectors use machine learning to estimate the 6DOF pose and shape of vehicles. This requires large amounts of precisely annotated 3D data for training and significant computing power for inference. Alternatively, there are methods that attempt to reconstruct target vehicle shapes and scales using projective geometry and classically detected feature points such as SURF and ORB. These methods rely on specific camera motion or geometric constraints that cannot always be assumed. The resulting model is an unstructured point cloud that contains no semantic information, which makes it inconvenient to use in a distributed perception system. In this study, the applicability of semantic keypoints to vehicle shape and trajectory estimation is explored. A novel method is presented that is capable of reconstructing the semantic shape and trajectory of the target vehicle from a sequence of images with state-of-the-art accuracy. The resulting semantic vertex model is then used for monocular, single-frame 6DOF pose estimation with high accuracy. Building on this, a cooperative perception framework is also introduced. The algorithm is tested in both in-vehicle and infrastructure-mounted mono-camera sensor setups. In addition to achieving state-of-the-art depth accuracy in vehicle trajectory reconstruction on the Argoverse dataset, our method outperforms the state-of-the-art shape-aware deep learning method in pose estimation in a cooperative perception scenario, both in simulation and in real-world experiments.
Pages: 167153 - 167167
Number of pages: 15
Related papers
50 in total
  • [41] Enhancing Monocular 3-D Object Detection Through Data Augmentation Strategies
    Jia, Yisong
    Wang, Jue
    Pan, Huihui
    Sun, Weichao
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 11
  • [42] M3DGAF: Monocular 3D Object Detection With Geometric Appearance Awareness and Feature Fusion
    Chen, Mu
    Liu, Pengfei
    Zhao, Huaici
    IEEE SENSORS JOURNAL, 2023, 23 (11) : 11232 - 11240
  • [43] 3-D Semantic Terrain Reconstruction of Monocular Close-Up Images of Martian Terrains
    Tian, Pengzhi
    Yao, Meibao
    Xiao, Xueming
    Zheng, Bo
    Cao, Tao
    Xi, Yurong
    Liu, Haiqiang
    Cui, Hutao
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 16
  • [44] BEVFusion With Dual Hard Instance Probing for Multimodal 3D Object Detection
    Kim, Taeho
    Kim, Joohee
    IEEE ACCESS, 2025, 13 : 25546 - 25556
  • [45] Adaptive Feature Fusion Based Cooperative 3D Object Detection for Autonomous Driving
    Wang, Junyong
    Zeng, Yuan
    Gong, Yi
    2022 3RD INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE (ICTC 2022), 2022, : 103 - 107
  • [46] CoBEV: Elevating Roadside 3D Object Detection With Depth and Height Complementarity
    Shi, Hao
    Pang, Chengshan
    Zhang, Jiaming
    Yang, Kailun
    Wu, Yuhao
    Ni, Huajian
    Lin, Yining
    Stiefelhagen, Rainer
    Wang, Kaiwei
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5424 - 5439
  • [47] Whole Stomach 3D Reconstruction and Frame Localization From Monocular Endoscope Video
    Widya, Aji Resindra
    Monno, Yusuke
    Okutomi, Masatoshi
    Suzuki, Sho
    Gotoda, Takuji
    Miki, Kenji
    IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE, 2019, 7
  • [48] Leveraging front and side cues for occlusion handling in monocular 3D object detection
    Yuying Song
    Zecheng Li
    Jingxuan Wu
    Chunyi Song
    Zhiwei Xu
    The Visual Computer, 2024, 40 : 1757 - 1773
  • [49] MonoCAPE: Monocular 3D object detection with coordinate-aware position embeddings
    Chen, Wenyu
    Chen, Mu
    Fang, Jian
    Zhao, Huaici
    Wang, Guogang
    COMPUTERS & ELECTRICAL ENGINEERING, 2024, 120
  • [50] Leveraging front and side cues for occlusion handling in monocular 3D object detection
    Song, Yuying
    Li, Zecheng
    Wu, Jingxuan
    Song, Chunyi
    Xu, Zhiwei
    VISUAL COMPUTER, 2024, 40 (03) : 1757 - 1773