Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments

Citations: 23
Authors
Dai, Ming [1]
Zheng, Enhui [2]
Feng, Zhenhua [3]
Qi, Lei [4]
Zhuang, Jiedong [5]
Yang, Wankou [1]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] China Jiliang Univ, Unmanned Syst Applicat Technol Res Inst, Hangzhou 310018, Peoples R China
[3] Univ Surrey, Sch Comp Sci & Elect Engn, Guildford GU2 7XH, England
[4] Southeast Univ, Sch Comp Sci, Nanjing 210096, Peoples R China
[5] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310063, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Autonomous aerial vehicles; Task analysis; Satellite images; Satellites; Location awareness; Drones; Web services; Unmanned aerial vehicle; geo-localization; transformer; image retrieval; NETWORK;
DOI
10.1109/TIP.2023.3346279
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unmanned Aerial Vehicles (UAVs) rely on satellite systems for stable positioning. However, due to limited satellite coverage or communication disruptions, UAVs may lose signals for positioning. In such situations, vision-based techniques can serve as an alternative, ensuring the self-positioning capability of UAVs. However, most of the existing datasets are developed for the geo-localization task of the objects captured by UAVs, rather than UAV self-positioning. Furthermore, the existing UAV datasets apply discrete sampling to synthetic data, such as Google Maps, neglecting the crucial aspects of dense sampling and the uncertainties commonly experienced in practical scenarios. To address these issues, this paper presents a new dataset, DenseUAV, that is the first publicly available dataset tailored for the UAV self-positioning task. DenseUAV adopts dense sampling on UAV images obtained in low-altitude urban areas. In total, over 27K UAV- and satellite-view images of 14 university campuses are collected and annotated. In terms of methodology, we first verify the superiority of Transformers over CNNs for the proposed task. Then we incorporate metric learning into representation learning to enhance the model's discriminative capacity and to reduce the modality discrepancy. Besides, to facilitate joint learning from both the satellite and UAV views, we introduce a mutually supervised learning approach. Last, we enhance the Recall@K metric and introduce a new measurement, SDM@K, to evaluate both the retrieval and localization performance for the proposed task. As a result, the proposed baseline method achieves a remarkable Recall@1 score of 83.01% and an SDM@1 score of 86.50% on DenseUAV. The dataset and code have been made publicly available on https://github.com/Dmmm1997/DenseUAV.
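The abstract evaluates retrieval with Recall@K. As a rough illustration only, the sketch below computes the standard Recall@K for cross-view image retrieval with NumPy: a query counts as a hit if any of its K most similar gallery items shares its location label. This is not the enhanced Recall@K or the SDM@K proposed in the paper, whose definitions are not given in this record. The function name `recall_at_k`, the use of cosine similarity, and the array layout are assumptions made for the example.

```python
import numpy as np


def recall_at_k(query_feats, gallery_feats, query_labels, gallery_labels, k=1):
    """Standard Recall@K (illustrative; not the paper's enhanced variant).

    query_feats:    (Nq, D) array, e.g. UAV-view embeddings (assumed layout)
    gallery_feats:  (Ng, D) array, e.g. satellite-view embeddings
    query_labels:   (Nq,) integer location IDs
    gallery_labels: (Ng,) integer location IDs
    """
    # L2-normalise so the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)

    # Similarity of every query to every gallery item, shape (Nq, Ng).
    sim = q @ g.T

    # Indices of the k most similar gallery items for each query.
    topk = np.argsort(-sim, axis=1)[:, :k]

    # A query is a hit if any of its top-k items has the same label.
    hits = (gallery_labels[topk] == query_labels[:, None]).any(axis=1)
    return float(hits.mean())
```

Calling `recall_at_k(uav_embeddings, satellite_embeddings, uav_ids, satellite_ids, k=1)` on hypothetical embedding arrays would return the fraction of UAV queries whose top-ranked satellite image matches the query's location.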
Pages: 493-508
Number of pages: 16