CIG2S: A Cross-View Image Geo-Localization Model Based on G2S Transform Suitable for Center-Misaligned Scenarios

Citations: 0

Authors:

Li, Jiangshan ^{[1,2]}

Yang, Chunfang ^{[3,4]}

Qi, Baojun ^{[3,4]}

Zhu, Ma ^{[3,4]}

Chen, Junyang ^{[5]}

Leung, Victor C. M. ^{[5]}

Affiliations:

[1] Zhengzhou Univ, Sch Cyber Sci & Engn, Zhengzhou 450002, Peoples R China

[2] Natl Univ Def Technol, Coll Elect Sci & Technol, Changsha 410073, Peoples R China

[3] Henan Key Lab Cyberspace Situat Awareness, Zhengzhou 450001, Peoples R China

[4] Minist Educ, Key Lab Cyberspace Secur, Zhengzhou 450001, Peoples R China

[5] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Satellite images; Transforms; Feature extraction; Satellites; Location awareness; Accuracy; Loss measurement; Generative adversarial networks; Correlation; Weight measurement; Cross-view image geo-localization; image retrieval; loss function; visual geo-localization;

DOI:

10.1109/TCSS.2024.3465539

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In multimedia social networks, a user's geolocation can be inferred by matching their shared images with reference satellite images, a task known as cross-view image geo-localization. Most existing cross-view image geo-localization methods perform well in the center-aligned scenario. In practical applications, however, the shooting location of the query ground image is most likely not aligned with the center point of the satellite images, and the geo-localization accuracy of these methods then decreases drastically. Therefore, we propose a novel cross-view image geo-localization model based on the ground-to-satellite (G2S) transform, named CIG2S. First, the query ground image is transformed into the aerial view by a spherical transform, generating G2S images, which can improve the similarity between ground and satellite images. Second, multiscale features are extracted from the original ground image, the G2S images, and the satellite images by Twins-PCPVT. Furthermore, a dynamic similarity weighted loss function is designed to measure the distance between the query ground image and the reference satellite image. Experimental results on three center-misaligned datasets, including VIGOR and the center-misaligned versions of CVUSA and CVACT, demonstrate that the proposed CIG2S model can significantly improve geo-localization accuracy. For example, compared with another vision-transformer-based model, L2LTR-polar, CIG2S outperforms it by about 6.6% and 15.8% on the center-misaligned datasets CVUSA_CM and CVACT_CM, respectively.

Pages: 16
