Semantic Shape and Trajectory Reconstruction for Monocular Cooperative 3D Object Detection

Cited by: 0

Authors:

Cserni, Marton [1]

Rovid, Andras [1]

Affiliations:

[1] Budapest Univ Technol & Econ BME, Fac Transportat Engn & Vehicle Engn, Dept Automot Technol, H-1111 Budapest, Hungary

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Semantics; Three-dimensional displays; Image reconstruction; Solid modeling; Trajectory; Pose estimation; Accuracy; Cameras; Computational modeling; Autonomous driving; shape aware monocular 3D object detection; trajectory reconstruction; semantic keypoints; cooperative perception

DOI:

10.1109/ACCESS.2024.3484672

Chinese Library Classification (CLC):

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Currently, state-of-the-art monocular 3D object detectors use machine learning to estimate the 6DOF pose and shape of vehicles. This requires large amounts of precisely annotated 3D data for training and significant computing power for inference. Alternatively, there are methods that attempt to reconstruct target vehicle shapes and scales using projective geometry and classically detected feature points such as SURF and ORB. These methods rely on specific camera motion or geometrical constraints, which cannot always be assumed. The resulting model is an unstructured point cloud that contains no semantic information, making it inconvenient to use in a distributed perception system. In this study, the applicability of semantic keypoints for vehicle shape and trajectory estimation is explored. A novel method is presented that is capable of reconstructing the semantic shape and trajectory of the target vehicle from a sequence of images with state-of-the-art accuracy. The resulting semantic vertex model is then used for monocular, single-frame 6DOF pose estimation with high accuracy. Building on this, a cooperative perception framework is also introduced. The algorithm is tested in both in-vehicle and infrastructure-mounted mono-camera sensor setups. In addition to achieving state-of-the-art depth accuracy in vehicle trajectory reconstruction on the Argoverse dataset, our method outperforms the state-of-the-art shape-aware deep learning method in pose estimation in a cooperative perception scenario, both in simulation and in real-world experiments.

Pages: 167153-167167

Page count: 15
