An empirical study of multi-scale object detection in high resolution UAV images

Cited by: 54
Authors
Zhang, Haijun [1]
Sun, Mingshan [1]
Li, Qun [1]
Liu, Linlin [1]
Liu, Ming [2]
Ji, Yuzhu [1]
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol, Sch Astronaut, Harbin 150001, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
UAV image; Object detection; High resolution; Multi-scale; Data set; Vehicle detection
DOI
10.1016/j.neucom.2020.08.074
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object detection in images collected by unmanned aerial vehicles (UAVs) is a challenging computer vision task, because it is difficult to train a detection model that handles instances with arbitrary orientations, widely varying scales, irregular shapes, and similar variations. To facilitate object detection research and extend its applications to natural scenarios using UAVs, this paper presents MOHR, a large-scale benchmark dataset for multi-scale object detection in high-resolution UAV images. A total of 90,014 object instances were annotated with labels and bounding boxes. To establish a baseline on the MOHR dataset, we conducted an empirical study evaluating six state-of-the-art deep learning-based object detection models trained on the proposed dataset. The experimental results show promising detection performance, but also demonstrate that the dataset is quite challenging for object detection models designed for natural images when they are applied to UAV images. © 2020 Elsevier B.V. All rights reserved.
Pages: 173-182
Page count: 10