Improved YOLO-V3 with DenseNet for Multi-Scale Remote Sensing Target Detection

Cited by: 116
Authors
Xu, Danqing [1 ]
Wu, Yiquan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing 211106, Peoples R China
Keywords
remote sensing image; target detection; multi-scale; YOLO-V3; convolutional neural network; DenseNet; SEGMENTATION;
DOI
10.3390/s20154276
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Remote sensing targets vary in size, are densely distributed, and appear against complex backgrounds, all of which make them difficult to detect. To detect remote sensing targets at different scales, a new model based on You Only Look Once (YOLO)-V3 is proposed. YOLO-V3 is a recent version of YOLO. To address the poor performance of YOLO-V3 on remote sensing targets, we adopted DenseNet (Densely Connected Network) to enhance feature extraction capability. Moreover, the number of detection scales was increased to four, up from the three of the original YOLO-V3. Experiments on the RSOD (Remote Sensing Object Detection) dataset and the UCS-AOD (Dataset of Object Detection in Aerial Images) dataset showed that our approach was more accurate than Faster-RCNN, SSD (Single Shot Multibox Detector), YOLO-V3, and YOLO-V3 tiny. Compared with the original YOLO-V3, the mAP (mean Average Precision) of our approach increased from 77.10% to 88.73% on the RSOD dataset. In particular, the mAP for classes such as aircraft, which consist mainly of small targets, increased by 12.12%. In addition, the detection speed was not significantly reduced. Overall, our approach achieved higher accuracy while still taking real-time performance into account for remote sensing target detection.
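The effect of adding a fourth detection scale can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes a 416 x 416 input, the usual YOLO-V3 strides of 32, 16, and 8, a hypothetical extra stride of 4 for the fourth scale, and 3 anchor boxes per grid cell. The function names are illustrative only.

```python
def grid_sizes(input_size, strides):
    """Return the side length of the prediction grid for each stride."""
    sizes = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        sizes.append(input_size // stride)
    return sizes


def total_predictions(input_size, strides, anchors_per_cell=3):
    """Count candidate boxes over all scales (grid cells times anchors per cell)."""
    return sum(g * g * anchors_per_cell for g in grid_sizes(input_size, strides))


# Assumed values, not confirmed by the abstract above.
INPUT_SIZE = 416
THREE_SCALE_STRIDES = (32, 16, 8)      # original YOLO-V3
FOUR_SCALE_STRIDES = (32, 16, 8, 4)    # hypothetical fourth, finer scale

if __name__ == "__main__":
    print(grid_sizes(INPUT_SIZE, FOUR_SCALE_STRIDES))
    print(total_predictions(INPUT_SIZE, THREE_SCALE_STRIDES))
    print(total_predictions(INPUT_SIZE, FOUR_SCALE_STRIDES))
```

Under these assumptions the finest grid has the most cells, which is one plausible reason a fourth scale would help with small targets such as aircraft, at the cost of more candidate boxes to score.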
Pages: 1 / 24
Number of pages: 23
Related papers
55 in total
[1]  
Adarsh P, 2020, INT CONF ADVAN COMPU, P687, DOI 10.1109/ICACCS48705.2020.9074315
[2]   Benchmark Revision for HOG-SVM Pedestrian Detector Through Reinvigorated Training and Evaluation Methodologies [J].
Bilal, Muhammad ;
Hanif, Muhammad Shehzad .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (03) :1277-1287
[3]  
Bochkovskiy A., 2020, YOLOv4: Optimal Speed and Accuracy of Object Detection
[4]   Rotation-reversal invariant HOG cascade for facial expression recognition [J].
Chen, Jinhui ;
Takiguchi, Tetsuya ;
Ariki, Yasuo .
SIGNAL IMAGE AND VIDEO PROCESSING, 2017, 11 (08) :1485-1492
[5]   MDSSD: multi-scale deconvolutional single shot detector for small objects [J].
Cui, Lisha ;
Ma, Rui ;
Lv, Pei ;
Jiang, Xiaoheng ;
Gao, Zhimin ;
Zhou, Bing ;
Xu, Mingliang .
SCIENCE CHINA-INFORMATION SCIENCES, 2020, 63 (02)
[6]   Calculation of the optimal segmentation scale in object-based multiresolution segmentation based on the scene complexity of high-resolution remote sensing images [J].
Feng, Tianjing ;
Ma, Hairong ;
Cheng, Xinwen ;
Zhang, Hongping .
JOURNAL OF APPLIED REMOTE SENSING, 2018, 12 (02)
[7]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448
[8]   Rich feature hierarchies for accurate object detection and semantic segmentation [J].
Girshick, Ross ;
Donahue, Jeff ;
Darrell, Trevor ;
Malik, Jitendra .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :580-587
[9]  
Guo C., 2019, P IEEECVF C COMPUTER
[10]  
Haque Md Foysal, 2018, [Journal of Korean Institute of Information Technology, 한국정보기술학회논문지], V16, P93, DOI 10.14801/jkiit.2018.16.10.93