Do Keypoints Contain Crucial Information? Mining Keypoint Information to Enhance Cross-View Geo-Localization

Cited by: 0

Authors:

Liang, Yanchao ^{[1]}

Wu, Xiangqian ^{[1]}

Affiliations:

[1] Harbin Inst Technol, Fac Comp, Harbin, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME 2024 | 2024

Funding:

Natural Science Foundation of Heilongjiang Province;

Keywords:

Geo-localization; Attention; Keypoint; Representation Learning;

DOI:

10.1109/ICME57554.2024.10688249

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Due to drastic view changes and different capture times between images, extracting discriminative image-level features for cross-view geo-localization is challenging. Although recent works have made outstanding progress on cross-view geo-localization, the fine-grained information in images has not been fully exploited when extracting image-level features. Inspired by how the human visual system distinguishes similar targets and by the process of keypoint detection and description, we propose a framework called UDPA-Net, which guides the model to mine more useful information for cross-view geo-localization by detecting keypoints. Specifically, we design a Unit Dot Product Attention Module (UDPAM) to discover notable keypoints automatically and guide the model to pay more attention to salient regions. UDPA-Net introduces few parameters but yields significant performance gains and can be easily integrated into different networks. Our code is available at https://gitee.com/KerasLyc/UDPA-Net.


Pages: 6

References: 19 in total

[1] Learning to Match Aerial Images with Deep Attentive Architectures. Altwaijry, Hani; Trulls, Eduard; Hays, James; Fua, Pascal; Belongie, Serge. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 3539-3547

[2] Chechik G, 2010, J MACH LEARN RES, V11, P1109

[3] AutoAugment: Learning Augmentation Strategies from Data. Cubuk, Ekin D.; Zoph, Barret; Mane, Dandelion; Vasudevan, Vijay; Le, Quoc V. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 113-123

[4] Deep Residual Learning for Image Recognition. He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[5] CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization. Hu, Sixing; Feng, Mengdan; Nguyen, Rang M. H.; Lee, Gim Hee. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7258-7267

[6] Ju HX, 2020, RSC DETECT SCI SER, V16, P47

[7] Mechanisms of visual attention in the human cortex. Kastner, S; Ungerleider, LG. ANNUAL REVIEW OF NEUROSCIENCE, 2000, 23: 315-341

[8] Reservoir Computing Transformer for Image-Text Retrieval. Li, Wenrui; Ma, Zhengyu; Deng, Liang-Jian; Wang, Penghong; Shi, Jinqiao; Fan, Xiaopeng. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 5605-5613

[9] Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization. Lin, Jinliang; Zheng, Zhedong; Zhong, Zhun; Luo, Zhiming; Li, Shaozi; Yang, Yi; Sebe, Nicu. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 3780-3792

[10] Lin TY, 2015, PROC CVPR IEEE, P5007, DOI 10.1109/CVPR.2015.7299135
