GeoViewMatch: A Multi-Scale Feature-Matching Network for Cross-View Geo-Localization Using Swin-Transformer and Contrastive Learning

Citations: 0
Authors
Zhang, Wenhui [1]
Zhong, Zhinong [1]
Chen, Hao [1, 2]
Jing, Ning [1, 2]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Sci & Technol, Changsha 410073, Peoples R China
[2] Minist Nat Resources, Key Lab Nat Resources Monitoring & Supervis Southe, Changsha 410073, Peoples R China
Keywords
cross-view geo-localization; contrastive learning; multi-scale feature extraction; remote sensing
DOI
10.3390/rs16040678
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Cross-view geo-localization aims to locate street-view images by matching them against a collection of GPS-tagged remote sensing (RS) images. The task is highly challenging because street-view and RS images differ significantly in viewpoint and appearance. Although deep learning-based methods dominate cross-view geo-localization, existing models have difficulty extracting comprehensive, meaningful features from both image domains, and therefore fail to establish accurate and robust dependencies between street-view images and their corresponding RS images. To address these issues, this paper proposes a novel, lightweight neural network for cross-view geo-localization. First, to capture more diverse information, we propose a module that extracts multi-scale features from images. Second, we introduce contrastive learning and design a contrastive loss to make the extraction and alignment of meaningful multi-scale features more robust. Finally, we conduct comprehensive experiments on two open benchmarks. The experimental results show that the proposed method outperforms state-of-the-art methods.
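The abstract does not specify the form of the contrastive loss. As an illustration only, the sketch below shows a generic symmetric InfoNCE-style loss over paired street-view and RS embeddings, a common choice in contrastive learning. The function names, the temperature value, and the toy embeddings are all assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (small epsilon avoids division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) + 1e-12
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        lse = m + math.log(sum(math.exp(x - m) for x in row))
        total += lse - row[i]
    return total / len(logits)


def symmetric_info_nce(street, aerial, temperature=0.1):
    """Hypothetical symmetric InfoNCE loss (not the paper's exact loss).

    street[i] and aerial[i] are assumed to be embeddings of the same
    location; every other pairing in the batch is treated as a negative.
    """
    s = [l2_normalize(v) for v in street]
    a = [l2_normalize(v) for v in aerial]
    # Cosine similarity between every street/aerial pair, scaled by temperature.
    logits = [
        [sum(x * y for x, y in zip(si, aj)) / temperature for aj in a]
        for si in s
    ]
    logits_t = [list(col) for col in zip(*logits)]  # aerial-to-street direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy embeddings: correctly paired views should give a lower loss
# than the same views with the pairing swapped.
street_emb = [[1.0, 0.0], [0.0, 1.0]]
aerial_matched = [[1.0, 0.1], [0.1, 1.0]]
aerial_swapped = [[0.1, 1.0], [1.0, 0.1]]
loss_matched = symmetric_info_nce(street_emb, aerial_matched)
loss_swapped = symmetric_info_nce(street_emb, aerial_swapped)
```

Minimizing a loss of this kind pulls the embeddings of matching street-view/RS pairs together and pushes non-matching pairs apart, which is the alignment behavior the abstract describes in general terms.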
Pages: 19