GeoViewMatch: A Multi-Scale Feature-Matching Network for Cross-View Geo-Localization Using Swin-Transformer and Contrastive Learning

Cited by: 0

Authors:

Zhang, Wenhui ^{[1]}

Zhong, Zhinong ^{[1]}

Chen, Hao ^{[1,2]}

Jing, Ning ^{[1,2]}

Affiliations:

[1] Natl Univ Def Technol, Coll Elect Sci & Technol, Changsha 410073, Peoples R China

[2] Minist Nat Resources, Key Lab Nat Resources Monitoring & Supervis Southe, Changsha 410073, Peoples R China

Source:

REMOTE SENSING | 2024 / Volume 16 / Issue 04

Keywords:

cross-view geo-localization; contrastive learning; multi-scale feature extraction; remote sensing;

DOI:

10.3390/rs16040678

Chinese Library Classification number:

X [Environmental Science, Safety Science];

Subject classification code:

08 ; 0830 ;

Abstract:

Cross-view geo-localization aims to locate street-view images by matching them against a collection of GPS-tagged remote sensing (RS) images. The task is highly challenging because of the significant viewpoint and appearance differences between street-view and RS images. Although deep learning-based methods dominate cross-view geo-localization, existing models have difficulty extracting comprehensive, meaningful features from both image domains, and therefore fail to establish accurate and robust dependencies between street-view images and their corresponding RS images. To address these issues, this paper proposes a novel, lightweight neural network for cross-view geo-localization. First, to capture more diverse information, we propose a module for extracting multi-scale features from images. Second, we introduce contrastive learning and design a contrastive loss to further enhance robustness in extracting and aligning meaningful multi-scale features. Finally, we conduct comprehensive experiments on two open benchmarks. The experimental results demonstrate the superiority of the proposed method over state-of-the-art methods.
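
As a rough illustration of the matching setup the abstract describes, the sketch below pairs street-view embeddings with RS embeddings, scores them by cosine similarity, and computes a contrastive loss over a batch. The abstract does not state the form of the paper's contrastive loss, so the symmetric InfoNCE formulation, the temperature value, and all function names here are assumptions chosen for illustration, not the authors' implementation. The embeddings are plain lists of floats standing in for the output of whatever feature extractor is used.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_matrix(street, rs):
    """Cosine similarity between every street-view and every RS embedding."""
    s = [l2_normalize(v) for v in street]
    r = [l2_normalize(v) for v in rs]
    return [[sum(a * b for a, b in zip(si, rj)) for rj in r] for si in s]


def info_nce(sim, temperature=0.1):
    """Mean cross-entropy in which row i is expected to select column i."""
    total = 0.0
    for i, row in enumerate(sim):
        logits = [x / temperature for x in row]
        m = max(logits)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum - logits[i]
    return total / len(sim)


def symmetric_info_nce(street, rs, temperature=0.1):
    """Average of the street-to-RS and RS-to-street losses (assumed form)."""
    sim = similarity_matrix(street, rs)
    sim_t = [list(col) for col in zip(*sim)]
    return 0.5 * (info_nce(sim, temperature) + info_nce(sim_t, temperature))


def retrieve(street, rs):
    """Index of the best-matching RS embedding for each street-view embedding."""
    sim = similarity_matrix(street, rs)
    return [max(range(len(row)), key=row.__getitem__) for row in sim]
```

Under these assumptions, a batch whose i-th street-view embedding lies closest to its i-th RS embedding gives a lower loss than the same batch with the RS embeddings shuffled. At inference time, `retrieve` returns the index of the GPS-tagged RS image assigned to each query.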

Pages: 19

