Image-to-Point Registration via Cross-Modality Correspondence Retrieval

Cited by: 0

Authors:

Bie, Lin [1]

Li, Siqi [1]

Cheng, Kai [2]

Affiliations:

[1] Tsinghua Univ, Sch Software, Beijing, Peoples R China

[2] Army Engn Univ, Command Control Coll, Nanjing, Peoples R China

Source:

PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024 | 2024

Keywords:

Image-to-Point Cloud registration; cross-modality correspondence retrieval; frustum point retrieval; combined correspondence retrieval; virtual point cloud;

DOI:

10.1145/3652583.3658074

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image-to-Point Cloud registration, which aligns 2D images with 3D LiDAR point clouds, is a significant task in computer vision. The traditional registration pipeline first establishes correspondences between images and point clouds and then estimates the pose from the resulting matches. However, 2D-3D correspondences are inherently difficult to establish because of the large modality gap between images and LiDAR point clouds. To address this, we build a bridge across the 2D-3D modality gap by aligning LiDAR point clouds to virtual points generated from images. In this way, the modality gap is reduced to a domain gap between two types of point clouds, i.e., the original point clouds and the virtual point clouds. Concretely, our framework fuses features from the LiDAR and virtual point clouds using a Transformer layer. To relieve the domain gap, we propose a frustum point retrieval module and a combined correspondence retrieval module. Both rely on the consistency of feature and position descriptors to select the correct correspondences among candidates, which are generated by retrieving on features and position descriptors simultaneously. For training, we design a frustum retrieval loss and a combined correspondence retrieval loss for cross-modality correspondence retrieval. Experimental results and comparisons with state-of-the-art Image-to-Point Cloud methods on the KITTI and nuScenes datasets demonstrate that the proposed method achieves superior performance.
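The abstract mentions frustum point retrieval and correspondence retrieval without giving details. The sketch below is not the authors' modules. It only illustrates two generic building blocks such a pipeline commonly relies on: a geometric test for which 3D points fall inside a camera frustum, and mutual nearest-neighbor matching between two sets of feature descriptors. The function names, the pinhole intrinsics matrix `K`, and the use of cosine similarity are assumptions made for this illustration.

```python
import numpy as np


def points_in_frustum(points_cam, K, width, height, min_depth=1e-6):
    """Boolean mask of 3D points (camera frame, shape (N, 3)) that project inside the image.

    Hypothetical helper: assumes a pinhole camera with 3x3 intrinsics K.
    """
    z = points_cam[:, 2]
    in_front = z > min_depth
    # Avoid dividing by zero or negative depth; masked out by in_front anyway.
    safe_z = np.where(in_front, z, 1.0)
    u = K[0, 0] * points_cam[:, 0] / safe_z + K[0, 2]
    v = K[1, 1] * points_cam[:, 1] / safe_z + K[1, 2]
    return in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)


def mutual_nearest_neighbors(feat_a, feat_b):
    """Index pairs (i, j) where a_i and b_j are each other's nearest neighbor.

    Hypothetical helper: similarity is cosine similarity of row descriptors.
    """
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T                      # (Na, Nb) cosine similarities
    nn_ab = sim.argmax(axis=1)         # best b for each a
    nn_ba = sim.argmax(axis=0)         # best a for each b
    idx_a = np.arange(len(a))
    keep = nn_ba[nn_ab] == idx_a       # keep only mutually consistent matches
    return np.stack([idx_a[keep], nn_ab[keep]], axis=1)
```

In a setup like the one described, the frustum mask would restrict the LiDAR points to those the camera can see before any matching, and the mutual check would discard one-sided candidate matches. How the paper combines feature and position descriptors, and how its losses are defined, is not specified in this record.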

Pages: 266-274

Page count: 9

