Lightweight LiDAR-Camera Alignment With Homogeneous Local-Global Aware Representation

Cited by: 1
Authors
Zhu, Angfan [1 ]
Xiao, Yang [2 ,3 ]
Liu, Chengxin [2 ,3 ]
Tan, Mingkui [2 ,4 ]
Cao, Zhiguo [2 ,3 ]
Affiliations
[1] China Acad Engn Phys, Inst Comp Applicat, Beijing 100873, Peoples R China
[2] Huazhong Univ Sci & Technol, Key Lab Image Proc & Intelligent Control, Minist Educ, Wuhan 430074, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[4] South China Univ Technol, Sch Software Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
LiDAR-camera alignment; deep learning; homogeneous multi-modality representation; local-global spatial awareness; transformer; SELF-CALIBRATION;
DOI
10.1109/TITS.2024.3409397
CLC Classification
TU [Architectural Science];
Discipline Code
0813;
Abstract
In this paper, a novel LiDAR-Camera Alignment (LCA) method using a homogeneous local-global spatial-aware representation is proposed. Compared with state-of-the-art methods (e.g., LCCNet), our approach offers two main advantages. First, a homogeneous multi-modality representation learned with a single shared CNN is applied across the iterative prediction stages, instead of the heterogeneous counterparts that state-of-the-art methods extract with separate modality-wise CNNs within each stage. This significantly reduces the model size (12.39M parameters for ours vs. 333.75M for LCCNet). Meanwhile, our method builds interaction between the LiDAR and camera data during feature learning to better exploit their descriptive cues, an aspect that existing approaches have largely neglected. Second, we equip the learned LCA representation with local-global spatial awareness by encoding the CNN's local convolutional features with the Transformer's non-local self-attention, so that local fine details and global spatial context are jointly captured by the encoded local features and jointly used for LCA. In contrast, existing methods generally reveal the global spatial property by simply concatenating local features. Additionally, at the initial LCA stage, the LiDAR is roughly pre-aligned with the camera according to the point-distribution characteristics of its 2D projection under the initial extrinsic parameters. Although structurally simple, this pre-alignment substantially eases LCA in the subsequent stages. To better optimize LCA, a novel loss function that correlates the translation and rotation loss terms is also proposed. Experiments on the KITTI dataset verify the superiority of our method in both effectiveness and efficiency. The source code will be released at https://github.com/Zaf233/Light-weight-LCA upon acceptance.
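The second contribution, encoding a CNN's local convolutional features with Transformer-style non-local self-attention, can be illustrated with a minimal PyTorch sketch. This is not the authors' code: the channel count, head count, and layer depth are illustrative assumptions; only the general pattern (one token per spatial location, self-attention mixing global context into each local feature) follows the abstract.

```python
# Minimal sketch (assumed sizes, not the paper's implementation):
# turn a CNN feature map into spatial tokens, let self-attention
# inject global context into each local feature, reshape back.
import torch
import torch.nn as nn

class LocalGlobalEncoder(nn.Module):
    def __init__(self, channels=256, num_heads=8, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feat):                      # feat: (B, C, H, W) CNN features
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per location
        tokens = self.encoder(tokens)             # non-local self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)

fused = LocalGlobalEncoder()(torch.randn(1, 256, 16, 32))  # same shape out
```

Each output location thus carries both its local convolutional detail and a globally attended context, rather than the global property being approximated by concatenating local features.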
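The pre-alignment step operates on the 2D projection of the LiDAR cloud under the initial extrinsics. The record does not spell out which distribution statistics are used, so the sketch below is a pure assumption for illustration: it projects the points with a pinhole model and compares the projected centroid against the image center as one conceivable rough-alignment cue. `centroid_offset` is a hypothetical helper, not from the paper.

```python
# Hedged sketch: project LiDAR points with the initial extrinsics and
# summarize the 2D point distribution. The centroid-offset statistic is
# an assumption; the paper's actual criterion is not given in this record.
import numpy as np

def project_lidar(points, T_lidar2cam, K):
    """Project (N,3) LiDAR points to pixels via 4x4 extrinsic T and 3x3 intrinsic K."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coordinates
    cam = (T_lidar2cam @ pts_h.T).T[:, :3]                  # into camera frame
    cam = cam[cam[:, 2] > 0]                                # keep points in front of camera
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]                           # (M, 2) pixel coordinates

def centroid_offset(points, T_init, K, img_w, img_h):
    """Hypothetical distribution statistic: projected-centroid offset from image center."""
    uv = project_lidar(points, T_init, K)
    return uv.mean(axis=0) - np.array([img_w / 2.0, img_h / 2.0])
```

A coarse correction derived from such a statistic would shrink the initial misalignment before the iterative stages refine it, matching the abstract's claim that a simple pre-alignment eases the subsequent LCA stages.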
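Finally, the loss is said only to correlate the translation and rotation terms; its actual formulation is not given in this record. As a loudly hypothetical stand-in, the sketch below couples the two terms multiplicatively, so that a large residual in one component raises the penalty on the other instead of letting the optimizer trade them off independently.

```python
# Illustrative, assumed formulation only -- NOT the paper's loss.
import torch

def correlated_pose_loss(t_pred, t_gt, q_pred, q_gt, alpha=1.0):
    l_t = torch.norm(t_pred - t_gt, dim=-1).mean()  # translation residual
    l_r = torch.norm(q_pred - q_gt, dim=-1).mean()  # rotation (quaternion) residual
    # Cross terms: each residual is amplified when the other is large,
    # coupling the two objectives (assumed coupling, for illustration).
    coupling = l_t.detach() * l_r + l_r.detach() * l_t
    return l_t + l_r + alpha * coupling
```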
Pages: 15922-15933
Page count: 12