RSOD: Real-time small object detection algorithm in UAV-based traffic monitoring

Cited by: 200
Authors
Sun, Wei [1, 2]
Dai, Liang [1]
Zhang, Xiaorui [2, 3, 4]
Chang, Pengshuai [1]
He, Xiaozheng [5]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Automat, Nanjing 210044, Peoples R China
[2] Jiangsu Collaborat Innovat Ctr Atmospher Environm, Nanjing 210044, Peoples R China
[3] Jiangsu Engn Ctr Network Monitoring, Nanjing 210044, Peoples R China
[4] Nanjing Univ Informat Sci & Technol, Wuxi Res Inst, Wuxi 214100, Jiangsu, Peoples R China
[5] Rensselaer Polytech Inst, Dept Civil & Environm Engn, Troy, NY 12180 USA
Funding
National Natural Science Foundation of China;
Keywords
Small object detection; UAV; Feature pyramid network; Squeeze-and-excitation attention mechanism; TEXTURE; IMAGES;
DOI
10.1007/s10489-021-02893-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The growing use of Unmanned Aerial Vehicles (UAVs) in transportation systems has promoted the development of object detection methods that collect real-time traffic information through UAVs. However, because objects seen from the aerial perspective are small and densely packed, most existing algorithms struggle to accurately process traffic images collected by UAVs and to extract informative features from them. To address these challenges, this paper proposes a new real-time small object detection (RSOD) algorithm based on YOLOv3, which improves small object detection accuracy by (i) using feature maps of a shallower layer, which contain more fine-grained information, for location prediction; (ii) fusing local and global features of shallow and deep feature maps in the Feature Pyramid Network (FPN) to enhance the ability to extract more representative features; (iii) assigning weights to the output features of the FPN and fusing them adaptively; and (iv) improving the excitation layer in the Squeeze-and-Excitation attention mechanism to adjust the feature responses of each channel more precisely. Experimental results show that, when the input size is 608 x 608 x 3, the precision of the proposed RSOD algorithm measured by mAP@0.5 is 43.3% and 52.7% on the VisDrone-DET2018 and UAVDT datasets, which is 3.4% and 5.1% higher than those of YOLOv3, respectively.
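For readers unfamiliar with item (iv), the following is a minimal sketch of the standard Squeeze-and-Excitation block, not the improved excitation layer that the paper proposes (the abstract does not give its details). The function name `se_block`, the toy input, and the weights `w1` and `w2` are assumptions made for illustration; biases are omitted for brevity.

```python
import math

def se_block(x, w1, w2):
    """Standard squeeze-and-excitation (illustrative sketch only).

    x  : list of C channels, each an H x W list of lists
    w1 : (C // r) x C weight matrix of the reduction layer
    w2 : C x (C // r) weight matrix of the expansion layer
    """
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: reduce to C // r units with ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expand back to C units with a sigmoid gate in (0, 1).
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Recalibration: multiply every value in a channel by that channel's gate.
    y = [[[v * s for v in row] for row in ch] for ch, s in zip(x, scale)]
    return y, scale

# Toy input: 2 channels of 2 x 2, reduction ratio r = 2 (hypothetical weights).
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[0.5, 0.5]]      # shape (C // r, C) = (1, 2)
w2 = [[1.0], [-1.0]]   # shape (C, C // r) = (2, 1)
y, scale = se_block(x, w1, w2)
```

Each channel is rescaled by a single gate value computed from the channel-wise averages, which is the "feature response of each channel" that the abstract refers to.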
Pages: 8448-8463
Page count: 16