Accurate UAV Small Object Detection Based on HRFPN and EfficentVMamba

Cited by: 0

Authors:

Wu, Shixiao [1]

Lu, Xingyuan [2]

Guo, Chengcheng [3,4]

Guo, Hong [5]

Affiliations:

[1] Wuhan Business Univ, Sch Informat Engn, Wuhan 430056, Peoples R China

[2] Tianjin Univ Technol, Key Lab Comp Vis & Syst, Minist Educ, Tianjin 300384, Peoples R China

[3] Wuhan Coll, Sch Informat Engn, Wuhan 430212, Peoples R China

[4] Wuhan Univ, Sch Elect Informat, Wuhan 430072, Peoples R China

[5] Tianjin Univ Technol, Sch Comp Sci & Engn, Tianjin 300384, Peoples R China

Source:

SENSORS | 2024 / Volume 24 / Issue 15

Keywords:

small object detection; deep learning; HRNet; Mamba; YOLO; feature fusion; NEURAL-NETWORK;

DOI:

10.3390/s24154966

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

(1) Background: Small objects in Unmanned Aerial Vehicle (UAV) images are often scattered across the image, including its corners, may be occluded by larger objects, and are susceptible to image noise. Moreover, because of their small size, these objects occupy a limited area of the image, which leaves few effective features for detection. (2) Methods: To address the detection of small objects in UAV imagery, we introduce a novel algorithm called High-Resolution Feature Pyramid Network Mamba-Based YOLO (HRMamba-YOLO). The algorithm combines the strengths of a High-Resolution Network (HRNet), EfficientVMamba, and YOLOv8, integrating a Double Spatial Pyramid Pooling (Double SPP) module, an Efficient Mamba Module (EMM), and a Fusion Mamba Module (FMM) to enhance feature extraction and capture contextual information. Additionally, a new multi-scale feature fusion network, the High-Resolution Feature Pyramid Network (HRFPN), together with the FMM, improves feature interactions and enhances small object detection performance. (3) Results: On the VisDroneDET dataset, the proposed algorithm achieved a Mean Average Precision (mAP) 4.4% higher than that of YOLOv8-m. On the Dota1.5 dataset, HRMamba achieved an mAP of 37.1%, surpassing YOLOv8-m by 3.8%. On the UCAS_AOD and DIOR datasets, our model achieved an mAP 1.5% and 0.3% higher than that of YOLOv8-m, respectively. For a fair comparison, all models were trained without pre-trained weights. (4) Conclusions: This study highlights the performance and efficiency of HRMamba-YOLO in small object detection tasks and offers solutions and insights for future research.

Pages: 23
