Depth-Reshaping Based Aerial Object Detection Enhanced Network

Cited by: 0
Authors
Fu, Tianyi [1 ,2 ]
Yang, Benyi [3 ,4 ]
Dong, Hongbin [1 ,2 ]
Deng, Baosong [3 ,4 ]
Affiliations
[1] College of Computer Science and Technology, Harbin Engineering University, Harbin
[2] National Engineering Laboratory for Modeling and Emulation in E-Government, Harbin Engineering University, Harbin
[3] Defense Innovation Institute (DII), Academy of Military Science, Beijing
[4] Intelligent Game and Decision Laboratory, Academy of Military Science, Beijing
Source
Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence | 2024, Vol. 37, No. 7
Funding
Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China
Keywords
Aerial Image; Computer Vision; Deep Learning; Feature Extraction; Object Detection;
DOI
10.16451/j.cnki.issn1003-6059.202407007
Abstract
To address complex background interference, the loss of fine detail in small objects, and the high efficiency demands of aerial image object detection, a depth-reshaping enhanced network (DR-ENet) is proposed. First, traditional downsampling methods are replaced with spatial depth-reshaping to reduce information loss during feature extraction and strengthen the network's ability to capture details. Then, a deformable spatial pyramid pooling method is designed to improve the network's adaptability to object shape variations and its recognition ability in complex backgrounds. Meanwhile, an attention-decoupled detection head is proposed to improve learning effectiveness across the different detection tasks. Finally, a small-scale aerial dataset, PORT, is constructed to capture both dense small objects and complex backgrounds. Experiments on three public aerial datasets and the PORT dataset demonstrate that DR-ENet improves detection performance, confirming its effectiveness and efficiency in aerial image object detection. © 2024 Science Press. All rights reserved.
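The spatial depth-reshaping idea mentioned in the abstract is commonly realized as a space-to-depth rearrangement: instead of discarding pixels via strided convolution or pooling, each 2×2 spatial block is moved into the channel dimension, halving the resolution without losing information. The sketch below is illustrative only (not the authors' code), using a NumPy function name chosen here for clarity:

```python
# Illustrative sketch of space-to-depth reshaping: downsampling that
# rearranges pixels into channels rather than discarding them.
import numpy as np

def space_to_depth(x: np.ndarray, block: int = 2) -> np.ndarray:
    """Rearrange a (H, W, C) feature map into (H/block, W/block, C*block*block)."""
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0, "H and W must be divisible by block"
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)  # group each block's offsets together
    return x.reshape(h // block, w // block, c * block * block)

feat = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)
out = space_to_depth(feat)
print(out.shape)  # (2, 2, 12): resolution halved, channels quadrupled
```

Because the operation is a pure permutation, every input value survives in the output; a 1×1 convolution typically follows to mix the expanded channels.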
Pages: 652-662
Page count: 10
References (42 total)
[1]  
FU C Y, LIU W, RANGA A, et al., DSSD: Deconvolutional Single Shot Detector[C/OL]
[2]  
ZHANG S F, WEN L Y, BIAN X, et al., Single-Shot Refinement Neural Network for Object Detection, Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4203-4212, (2018)
[3]  
LIN T Y, GOYAL P, GIRSHICK R, et al., Focal Loss for Dense Object Detection, Proc of the IEEE/CVF International Conference on Computer Vision, pp. 2997-3007, (2017)
[4]  
LI Z X, YANG L, ZHOU F Q., FSSD: Feature Fusion Single Shot Multibox Detector[C/OL]
[5]  
REDMON J, DIVVALA S, GIRSHICK R, et al., You Only Look Once: Unified, Real-Time Object Detection, Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 779-788, (2016)
[6]  
LIU W, ANGUELOV D, ERHAN D, et al., SSD: Single Shot MultiBox Detector, Proc of the European Conference on Computer Vision, pp. 21-37, (2016)
[7]  
REN S Q, HE K M, GIRSHICK R, et al., Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 6, pp. 1137-1149, (2017)
[8]  
HE K M, GKIOXARI G, DOLLAR P, et al., Mask R-CNN, Proc of the IEEE/CVF International Conference on Computer Vision, pp. 2980-2988, (2017)
[9]  
HE K M, ZHANG X Y, REN S Q, et al., Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 9, pp. 1904-1916, (2015)
[10]  
GIRSHICK R., Fast R-CNN, Proc of the IEEE/CVF International Conference on Computer Vision, pp. 1440-1448, (2015)