Do Keypoints Contain Crucial Information? Mining Keypoint Information to Enhance Cross-View Geo-Localization

Citations: 0
Authors
Liang, Yanchao [1]
Wu, Xiangqian [1]
Affiliations
[1] Harbin Inst Technol, Fac Comp, Harbin, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME 2024 | 2024
Funding
Natural Science Foundation of Heilongjiang Province;
Keywords
Geo-localization; Attention; Keypoint; Representation Learning;
DOI
10.1109/ICME57554.2024.10688249
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to drastic view changes and different capturing times between images, extracting discriminative image-level features for cross-view geo-localization is challenging. Although recent works have achieved outstanding progress on cross-view geo-localization, the fine-grained information in images has not been fully explored in extracting image-level features. Inspired by the process of the human visual system to distinguish similar targets and the process of keypoint detection and description, we propose a framework called UDPA-Net, which guides the model to mine more favorable information for cross-view geo-localization by detecting keypoints. Specifically, we design a Unit Dot Product Attention Module (UDPAM) to discover remarkable keypoints automatically and guide the model to pay more attention to the salient regions. UDPA-Net introduces few parameters but yields significant performance gains and can be easily integrated into different networks. Our code is available at https://gitee.com/KerasLyc/UDPA-Net.
Pages: 6
Related papers
19 in total
[1]   Learning to Match Aerial Images with Deep Attentive Architectures [J].
Altwaijry, Hani ;
Trulls, Eduard ;
Hays, James ;
Fua, Pascal ;
Belongie, Serge .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3539-3547
[2]  
Chechik G, 2010, J MACH LEARN RES, V11, P1109
[3]   AutoAugment: Learning Augmentation Strategies from Data [J].
Cubuk, Ekin D. ;
Zoph, Barret ;
Mane, Dandelion ;
Vasudevan, Vijay ;
Le, Quoc V. .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :113-123
[4]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[5]   CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization [J].
Hu, Sixing ;
Feng, Mengdan ;
Nguyen, Rang M. H. ;
Lee, Gim Hee .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :7258-7267
[6]  
Ju HX, 2020, RSC DETECT SCI SER, V16, P47
[7]   Mechanisms of visual attention in the human cortex [J].
Kastner, S ;
Ungerleider, LG .
ANNUAL REVIEW OF NEUROSCIENCE, 2000, 23 :315-341
[8]   Reservoir Computing Transformer for Image-Text Retrieval [J].
Li, Wenrui ;
Ma, Zhengyu ;
Deng, Liang-Jian ;
Wang, Penghong ;
Shi, Jinqiao ;
Fan, Xiaopeng .
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, :5605-5613
[9]   Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization [J].
Lin, Jinliang ;
Zheng, Zhedong ;
Zhong, Zhun ;
Luo, Zhiming ;
Li, Shaozi ;
Yang, Yi ;
Sebe, Nicu .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 :3780-3792
[10]  
Lin TY, 2015, PROC CVPR IEEE, P5007, DOI 10.1109/CVPR.2015.7299135