Light-weight Deep Neural Network for Small Vehicle Detection using Model-scale YOLOv4

Cited by: 0
Authors
Kim M. [1]
Kim H. [2]
Park C. [2]
Paik J. [1,2]
Affiliations
[1] Graduate School of Artificial Intelligence, Chung-Ang University, Seoul
[2] Department of Image, Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul
Keywords
Attention mechanism; Deep learning; Light-weight; Small object detection
DOI
10.5573/IEIESPC.2023.12.5.369
Abstract
In this paper, we present a light-weight deep neural network based on an efficiently scaled YOLOv4 model for detecting small objects in drone images. Since drone-captured images mainly contain small objects, we modified the YOLOv4 model by eliminating the head layer responsible for detecting large objects. This modification significantly reduced the model's parameter count and the processing time for non-maximum suppression (NMS). Moreover, the model, appropriately scaled for small object detection, can be deployed on a drone. To achieve a light-weight network for small object detection with minimal performance degradation, we used the attention stacked hourglass network (ASHN) for feature fusion. In extensive experiments, the proposed network outperformed the baseline network on several datasets. © 2023 Institute of Electronics and Information Engineers. All rights reserved.
Pages: 369-378
Page count: 9
Related papers
39 in total
[31] Tan M., Pang R., Le Q. V., EfficientDet: Scalable and efficient object detection, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781-10790, (2020)
[32] Du D., Zhu P., Wen L., Bian X., Lin H., Hu Q., Peng T., Zheng J., Wang X., Zhang Y., et al., VisDrone-DET2019: The vision meets drone object detection in image challenge results, Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, (2019)
[33] Newell A., Yang K., Deng J., Stacked hourglass networks for human pose estimation, European Conference on Computer Vision, pp. 483-499, (2016)
[34] Woo S., Park J., Lee J.-Y., Kweon I. S., CBAM: Convolutional block attention module, Proceedings of the European Conference on Computer Vision (ECCV), pp. 3-19, (2018)
[35] Hu J., Shen L., Sun G., Squeeze-and-excitation networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132-7141, (2018)
[36] Park J., Woo S., Lee J.-Y., Kweon I. S., BAM: Bottleneck attention module, (2018)
[37] Du D., Qi Y., Yu H., Yang Y., Duan K., Li G., Zhang W., Huang Q., Tian Q., The unmanned aerial vehicle benchmark: Object detection and tracking, Proceedings of the European Conference on Computer Vision (ECCV), pp. 370-386, (2018)
[38] Hsieh M.-R., Lin Y.-L., Hsu W. H., Drone-based object counting by spatially regularized regional proposal network, Proceedings of the IEEE International Conference on Computer Vision, pp. 4145-4153, (2017)
[39] Redmon J., et al., You only look once: Unified, real-time object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2016)