LNIFT: Locally Normalized Image for Rotation Invariant Multimodal Feature Matching

Cited by: 86
Authors
Li, Jiayuan [1 ]
Xu, Wangyi [1 ]
Shi, Pengcheng [2 ]
Zhang, Yongjun [1 ]
Hu, Qingwu [1 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2022, Vol. 60
Funding
National Natural Science Foundation of China;
Keywords
Depth-optical; feature matching; infrared-optical; local descriptor; multimodal image matching; synthetic aperture radar (SAR)-optical; AUTOMATIC REGISTRATION; FRAMEWORK; SAR; MAXIMIZATION;
DOI
10.1109/TGRS.2022.3165940
CLC Classification
P3 [Geophysics]; P59 [Geochemistry];
Subject Classification Codes
0708; 070902;
Abstract
Severe nonlinear radiation distortion (NRD) is the bottleneck problem in multimodal image matching. Although many methods have been proposed in the past few years, such as the radiation-variation insensitive feature transform (RIFT) and the histogram of orientated phase congruency (HOPC), almost all of them rely on frequency-domain information, which incurs high computational overhead and a large memory footprint. In this article, we propose a simple but very effective multimodal feature matching algorithm in the spatial domain, called the locally normalized image feature transform (LNIFT). We first propose a local normalization filter that converts original images into normalized images for feature detection and description, which largely reduces the NRD between multimodal images. We demonstrate that normalized matching pairs have a much larger correlation coefficient than the original ones. We then detect oriented FAST and rotated BRIEF (ORB) keypoints on the normalized images and use an adaptive nonmaximal suppression (ANMS) strategy to improve the spatial distribution of the keypoints. We also describe keypoints on the normalized images with a histogram of oriented gradient (HOG)-like descriptor. LNIFT achieves the same rotation invariance as ORB without any additional computational overhead, so it runs in near real time on 1024 x 1024 images (only 0.32 s with 2500 keypoints). Four multimodal image datasets with a total of 4000 matching pairs are used for a comprehensive evaluation, including synthetic aperture radar (SAR)-optical, infrared-optical, and depth-optical datasets. Experimental results show that LNIFT is far superior to RIFT in terms of efficiency (0.49 s versus 47.8 s on a 1024 x 1024 image), success rate (99.9% versus 79.85%), and number of correct matches (309 versus 119). The source code and datasets will be publicly available at https://ljy-rs.github.io/web.
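To make the abstract's pipeline concrete, below is a minimal Python sketch of the two spatial-domain steps it names: a local normalization filter followed by ORB keypoint detection on the normalized images. The box-filter formulation, the window size win, the eps guard, and the input file names are illustrative assumptions, not the paper's exact definitions; the ANMS thinning and the HOG-like descriptor stages are omitted.

import cv2
import numpy as np
from scipy.ndimage import uniform_filter

def local_normalize(img, win=15, eps=1e-6):
    # Locally normalized image: (I - local mean) / local std, computed with
    # box filters over a win x win neighborhood (win is a hypothetical choice).
    f = img.astype(np.float64)
    mean = uniform_filter(f, size=win)
    sq_mean = uniform_filter(f * f, size=win)
    std = np.sqrt(np.maximum(sq_mean - mean * mean, 0.0))
    norm = (f - mean) / (std + eps)
    # Rescale to 8-bit so OpenCV's ORB detector can consume the result.
    return cv2.normalize(norm, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# Detect ORB keypoints on the normalized images ("optical.png" and "sar.png"
# are placeholder inputs); 2500 keypoints matches the abstract's timing setup.
ref = local_normalize(cv2.imread("optical.png", cv2.IMREAD_GRAYSCALE))
sen = local_normalize(cv2.imread("sar.png", cv2.IMREAD_GRAYSCALE))
orb = cv2.ORB_create(nfeatures=2500)
kp_ref = orb.detect(ref, None)
kp_sen = orb.detect(sen, None)

Because normalization suppresses the modality-specific intensity mapping before detection, the same off-the-shelf detector can be applied to both images; this is the sense in which the method stays entirely in the spatial domain.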
Pages: 14
Related Papers
53 references in total
[31]   ASIFT: A New Framework for Fully Affine Invariant Image Comparison [J].
Morel, Jean-Michel ;
Yu, Guoshen .
SIAM JOURNAL ON IMAGING SCIENCES, 2009, 2 (02) :438-469
[32]  
Ofverstedt J., arXiv:2106.14699, 2021
[33]   Nonrigid Registration of Ultrasound and MRI Using Contextual Conditioned Mutual Information [J].
Rivaz, Hassan ;
Karimaghaloo, Zahra ;
Fonov, Vladimir S. ;
Collins, D. Louis .
IEEE TRANSACTIONS ON MEDICAL IMAGING, 2014, 33 (03) :708-725
[34]   Machine learning for high-speed corner detection [J].
Rosten, Edward ;
Drummond, Tom .
COMPUTER VISION - ECCV 2006, PT 1, PROCEEDINGS, 2006, 3951 :430-443
[35]   ORB: An efficient alternative to SIFT or SURF [J].
Rublee, Ethan ;
Rabaud, Vincent ;
Konolige, Kurt ;
Bradski, Gary .
2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011 :2564-2571, DOI 10.1109/ICCV.2011.6126544
[36]   Illumination-Robust remote sensing image matching based on oriented self-similarity [J].
Sedaghat, Amin ;
Mohammadi, Nazila .
ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2019, 153 :21-35
[37]   Matching Local Self-Similarities across Images and Videos [J].
Shechtman, Eli ;
Irani, Michal .
PROC CVPR IEEE, 2007 :1744
[38]   An overlap invariant entropy measure of 3D medical image alignment [J].
Studholme, C ;
Hill, DLG ;
Hawkes, DJ .
PATTERN RECOGNITION, 1999, 32 (01) :71-86
[39]   Efficient Discrimination and Localization of Multimodal Remote Sensing Images Using CNN-Based Prediction of Localization Uncertainty [J].
Uss, Mykhail ;
Vozel, Benoit ;
Lukin, Vladimir ;
Chehdi, Kacem .
REMOTE SENSING, 2020, 12 (04)
[40]   Alignment by maximization of mutual information [J].
Viola, P ;
Wells, WM .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 1997, 24 (02) :137-154