Boost UAV-Based Object Detection via Scale-Invariant Feature Disentanglement and Adversarial Learning

Cited by: 2

Authors:

Liu, Fan [1]

Yao, Liang [1]

Zhang, Chuanyi [2]

Wu, Ting [1]

Zhang, Xinlei [1]

Jiang, Xiruo [3]

Zhou, Jun [4]

Affiliations:

[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing 210098, Peoples R China

[2] Hohai Univ, Coll Artificial Intelligence & Automat, Changzhou 213200, Peoples R China

[3] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 611756, Peoples R China

[4] Griffith Univ, Sch Informat & Commun Technol, Nathan, Qld 4111, Australia

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2025 / Vol. 63

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Object detection; Autonomous aerial vehicles; Detectors; Accuracy; Training; Representation learning; Head; Benchmark testing; Artificial intelligence; Adversarial learning; feature disentanglement; scale-invariant feature learning; uncrewed aerial vehicle (UAV)-based object detection;

DOI:

10.1109/TGRS.2025.3564261

Chinese Library Classification:

P3 [Geophysics]; P59 [Geochemistry];

Subject classification codes:

0708 ; 070902 ;

Abstract:

Object detection from uncrewed aerial vehicles (UAVs) is often hindered by the large number of small objects, resulting in low detection accuracy. To address this issue, mainstream approaches typically rely on multistage inference. Although these approaches achieve remarkable detection accuracy, they sacrifice real-time efficiency, which makes them less practical for real applications. We therefore propose to improve single-stage inference accuracy by learning scale-invariant features. Specifically, a scale-invariant feature disentangling (SIFD) module is designed to disentangle scale-related and scale-invariant features. Then, an adversarial feature learning (AFL) scheme is employed to enhance disentanglement. Finally, scale-invariant features are leveraged for robust UAV-based object detection (UAV-OD). Furthermore, we construct a multimodal UAV object detection dataset, State-Air, which incorporates annotated UAV state parameters. We apply our approach to three lightweight detection frameworks on two benchmark datasets. Extensive experiments demonstrate that our approach effectively improves model accuracy and achieves state-of-the-art (SoTA) performance on three datasets. Our code and dataset are publicly available at: https://github.com/1e12Leon/SIFDAL

Pages: 13
