DV-DETR: Improved UAV Aerial Small Target Detection Algorithm Based on RT-DETR

被引:8
作者
Wei, Xiaolong [1 ]
Yin, Ling [1 ]
Zhang, Liangliang [1 ]
Wu, Fei [1 ]
机构
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
关键词
transformer; small target detection; real-time task; RT-DETR algorithm;
D O I
10.3390/s24227376
中图分类号
O65 [分析化学];
学科分类号
070302 ; 081704 ;
摘要
For drone-based detection tasks, accurately identifying small-scale targets like people, bicycles, and pedestrians remains a key challenge. In this paper, we propose DV-DETR, an improved detection model based on the Real-Time Detection Transformer (RT-DETR), specifically optimized for small target detection in high-density scenes. To achieve this, we introduce three main enhancements: (1) ResNet18 as the backbone network to improve feature extraction and reduce model complexity; (2) the integration of recalibration attention units and deformable attention mechanisms in the neck network to enhance multi-scale feature fusion and improve localization accuracy; and (3) the use of the Focaler-IoU loss function to better handle the imbalanced distribution of target scales and focus on challenging samples. Experimental results on the VisDrone2019 dataset show that DV-DETR achieves an mAP@0.5 of 50.1%, a 1.7% improvement over the baseline model, while increasing detection speed from 75 FPS to 90 FPS, meeting real-time processing requirements. These improvements not only enhance the model's accuracy and efficiency but also provide practical significance in complex, high-density urban environments, supporting real-world applications in UAV-based surveillance and monitoring tasks.
引用
收藏
页数:15
相关论文
共 24 条
[1]   End-to-End Object Detection with Transformers [J].
Carion, Nicolas ;
Massa, Francisco ;
Synnaeve, Gabriel ;
Usunier, Nicolas ;
Kirillov, Alexander ;
Zagoruyko, Sergey .
COMPUTER VISION - ECCV 2020, PT I, 2020, 12346 :213-229
[2]   RepVGG: Making VGG-style ConvNets Great Again [J].
Ding, Xiaohan ;
Zhang, Xiangyu ;
Ma, Ningning ;
Han, Jungong ;
Ding, Guiguang ;
Sun, Jian .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :13728-13737
[3]   VisDrone-DET2019: The Vision Meets Drone Object Detection in Image Challenge Results [J].
Du, Dawei ;
Zhu, Pengfei ;
Wen, Longyin ;
Bian, Xiao ;
Ling, Haibin ;
Hu, Qinghua ;
Peng, Tao ;
Zheng, Jiayu ;
Wang, Xinyao ;
Zhang, Yue ;
Bo, Liefeng ;
Shi, Hailin ;
Zhu, Rui ;
Kumar, Aashish ;
Li, Aijin ;
Zinollayev, Almaz ;
Askergaliyev, Anuar ;
Schumann, Arne ;
Mao, Binjie ;
Lee, Byeongwon ;
Liu, Chang ;
Chen, Changrui ;
Pan, Chunhong ;
Huo, Chunlei ;
Yu, Da ;
Cong, Dechun ;
Zeng, Dening ;
Pailla, Dheeraj Reddy ;
Li, Di ;
Wang, Dong ;
Cho, Donghyeon ;
Zhang, Dongyu ;
Bai, Furui ;
Jose, George ;
Gao, Guangyu ;
Liu, Guizhong ;
Xiong, Haitao ;
Qi, Hao ;
Wang, Haoran ;
Qiu, Heqian ;
Li, Hongliang ;
Lu, Huchuan ;
Kim, Ildoo ;
Kim, Jaekyum ;
Shen, Jane ;
Lee, Jihoon ;
Ge, Jing ;
Xu, Jingjing ;
Zhou, Jingkai ;
Meier, Jonas .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, :213-226
[4]  
G. I. O. Union, 2019, P IEEE CVF C COMP VI, P658
[5]   Rich feature hierarchies for accurate object detection and semantic segmentation [J].
Girshick, Ross ;
Donahue, Jeff ;
Darrell, Trevor ;
Malik, Jitendra .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :580-587
[6]   A Survey on Vision Transformer [J].
Han, Kai ;
Wang, Yunhe ;
Chen, Hanting ;
Chen, Xinghao ;
Guo, Jianyuan ;
Liu, Zhenhua ;
Tang, Yehui ;
Xiao, An ;
Xu, Chunjing ;
Xu, Yixing ;
Yang, Zhaohui ;
Zhang, Yiman ;
Tao, Dacheng .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) :87-110
[7]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[8]   DN-DETR: Accelerate DETR Training by Introducing Query DeNoising [J].
Li, Feng ;
Zhang, Hao ;
Liu, Shilong ;
Guo, Jian ;
Ni, Lionel M. ;
Zhang, Lei .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, :13609-13617
[9]   Target Detection Algorithm of UAV Aerial Image Based on Improved YOLOv5 [J].
Li, Xiaolin ;
Liu, Dadong ;
Liu, Xinman ;
Chen, Ze .
Computer Engineering and Applications, 2024, 60 (11) :204-214
[10]   Geodesic Distance Based Scattering Power Decomposition for Compact Polarimetric SAR Data [J].
Muhuri, Arnab ;
Goita, Kalifa ;
Magagi, Ramata ;
Wang, Hongquan .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61