CrossHomo: Cross-Modality and Cross-Resolution Homography Estimation

Cited by: 5
Authors
Deng, Xin [2]
Liu, Enpeng [2]
Gao, Chao [1]
Li, Shengxi [2]
Gu, Shuhang [3]
Xu, Mai [2]
Affiliations
[1] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[3] Univ Elect Sci & Technol, Sch Comp Sci & Engn, Chengdu 610056, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Estimation; Image resolution; Feature extraction; Superresolution; Deep learning; Task analysis; Spatial resolution; Homography estimation; multi-modal image registration; UNSUPERVISED DEEP HOMOGRAPHY; IMAGE;
DOI
10.1109/TPAMI.2024.3366234
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-modal homography estimation aims to spatially align images from different modalities, which is challenging because both the image content and the resolution vary across modalities. In this paper, we introduce a novel framework, CrossHomo, to tackle this problem. Our framework is motivated by two findings which demonstrate the mutual benefits between image super-resolution and homography estimation. Based on these findings, we design a flexible multi-level homography estimation network to align the multi-modal images in a coarse-to-fine manner. Each level is composed of a multi-modal image super-resolution (MISR) module to shrink the resolution gap between different modalities, followed by a multi-modal homography estimation (MHE) module to predict the homography matrix. To the best of our knowledge, CrossHomo is the first attempt to address the homography estimation problem with both modality and resolution discrepancy. Extensive experimental results show that CrossHomo achieves high registration accuracy on various multi-modal datasets with different resolution gaps. In addition, the network is efficient in terms of both model complexity and running speed.
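The coarse-to-fine alignment mentioned in the abstract can be illustrated with a minimal NumPy sketch of the underlying geometry. This is not the CrossHomo network: the learned MISR and MHE modules are not modeled, and the function names, the 2x scale factor between levels, and the convention that each level predicts a residual homography in its own pixel coordinates are all assumptions made for illustration. The sketch only shows how a homography estimated at a low resolution can be carried to a higher resolution and refined there.

```python
import numpy as np

def rescale_homography(H, scale):
    """Re-express a homography for pixel coordinates multiplied by `scale`.

    If x_hi = S x_lo with S = diag(scale, scale, 1), then H_hi = S H_lo S^-1.
    """
    S = np.diag([scale, scale, 1.0])
    return S @ H @ np.linalg.inv(S)

def apply_homography(H, pts):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # back to Cartesian

def compose_coarse_to_fine(residuals, scale=2.0):
    """Accumulate per-level residual homographies, coarsest level first.

    Hypothetical convention: at each level the running estimate is first
    rescaled to that level's resolution, then refined by the residual
    predicted at that level.
    """
    H_total = np.eye(3)
    for i, H_res in enumerate(residuals):
        if i > 0:
            H_total = rescale_homography(H_total, scale)
        H_total = H_res @ H_total
    return H_total / H_total[2, 2]  # fix the overall scale ambiguity

def translation(tx, ty):
    """Helper: a pure-translation homography."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

# A (1, 2) pixel shift found at the coarse level becomes (2, 4) at 2x
# resolution; the fine level then adds a 0.5 pixel correction in x.
H = compose_coarse_to_fine([translation(1.0, 2.0), translation(0.5, 0.0)])
print(apply_homography(H, np.array([[0.0, 0.0]])))
```

In this toy case the two levels combine into a single translation of (2.5, 4) pixels at the fine resolution, which shows why a coarse estimate only needs a small correction at each finer level.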
Pages: 5725-5742
Page count: 18