Indoor Scene Object Detection Based on Improved YOLOv4 Algorithm

Cited by: 4
Authors
Li Weigang [1 ]
Yang Chao [1 ]
Jiang Lin [1 ]
Zhao Yuntao [1 ]
Affiliations
[1] Wuhan Univ Sci & Technol, Engn Res Ctr Met Automat & Measurement Technol, Minist Educ, Wuhan 430081, Hubei, Peoples R China
Keywords
object detection; indoor scene; YOLOv4; cross stage partial network; depthwise separable convolution;
DOI
10.3788/LOP202259.1815003
Chinese Library Classification (CLC)
TM (electrical engineering); TN (electronics and communication technology);
Discipline codes
0808; 0809;
Abstract
In this paper, we proposed an improved YOLOv4 algorithm model to solve the problems of low detection accuracy and slow detection speed of traditional indoor scene object detection methods. First, we constructed an indoor scene object detection dataset. Then, we applied the K-means++ clustering algorithm to optimize the prior-box parameters and improve the match between the prior boxes and the objects. Next, we adjusted the network structure of the original YOLOv4 model and integrated the cross stage partial network architecture into the neck network of the model. This eliminates the gradient-information redundancy caused by gradient backpropagation in the feature-fusion stage and improves the detection ability for indoor targets. Furthermore, we introduced a depthwise separable convolution module to replace the original 3×3 convolution layers in the model, which reduces the model parameters and improves the detection speed. The experimental results show that the improved YOLOv4 algorithm achieves an average accuracy of 83.0% and a detection speed of 72.1 frame/s on the indoor scene object detection dataset, which are 3.2 percentage points and 6 frame/s higher than those of the original YOLOv4 algorithm, respectively; in addition, the model size is reduced by 36.3%. The improved YOLOv4 algorithm outperforms other deep-learning-based indoor scene object detection algorithms.
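The abstract's first step, anchor (prior-box) optimization with K-means++, can be sketched as follows. The paper does not give implementation details, so this is a minimal sketch under a common assumption for YOLO-family models: boxes are clustered by width and height using a 1 − IoU distance (with boxes aligned at a common corner), and K-means++ seeding picks each new center with probability proportional to its squared distance to the nearest existing center. The function names and the toy box list are illustrative, not from the paper.

```python
import random

def iou_wh(a, b):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union

def kmeanspp_init(boxes, k, rng):
    """K-means++ seeding: each new center is drawn with probability
    proportional to its squared (1 - IoU) distance to the nearest center."""
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min((1 - iou_wh(b, c)) ** 2 for c in centers) for b in boxes]
        r = rng.random() * sum(d2)
        acc = 0.0
        for b, w in zip(boxes, d2):
            acc += w
            if acc >= r:
                centers.append(b)
                break
    return centers

def cluster_anchors(boxes, k, iters=20, seed=0):
    """Lloyd iterations with 1 - IoU distance; returns k anchor (w, h) pairs."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the center with the highest IoU.
            i = max(range(k), key=lambda j: iou_wh(b, centers[j]))
            groups[i].append(b)
        centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

# Toy example: two obvious size clusters should yield two matching anchors.
boxes = [(10, 10), (11, 9), (9, 11), (50, 50), (52, 48), (48, 52)]
anchors = sorted(cluster_anchors(boxes, k=2), key=lambda c: c[0] * c[1])
print(anchors)
```

Using 1 − IoU instead of Euclidean distance keeps the clustering scale-aware, so small boxes are not drowned out by large ones when the anchors are averaged.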
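The parameter saving from replacing a standard 3×3 convolution with a depthwise separable one can be checked with simple arithmetic. This sketch uses the standard MobileNet-style decomposition (a per-channel k×k depthwise convolution followed by a 1×1 pointwise convolution); the 256-channel layer size is an illustrative assumption, not a layer from the paper, and biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k×k convolution (biases ignored)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise k×k (one filter per input channel) plus pointwise 1×1."""
    return c_in * k * k + c_in * c_out

# Hypothetical example: a 3×3 layer mapping 256 channels to 256 channels.
std = conv_params(256, 256, 3)
dws = dw_separable_params(256, 256, 3)
print(std, dws, f"reduction: {1 - dws / std:.1%}")
```

For this layer the standard convolution needs 589 824 weights versus 67 840 for the separable version, a roughly 88% reduction, which is why swapping in depthwise separable convolutions shrinks the model and speeds up inference at a modest cost in representational capacity.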
Pages: 10