Fusing Deep Dilated Convolutions Network and Light-Weight Network for Object Detection

Cited by: 0
Authors
Quan Y. [1 ]
Li Z.-X. [1 ]
Zhang C.-L. [1 ]
Ma H.-F. [1 ,2 ]
Affiliations
[1] Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, 541004, Guangxi
[2] College of Computer Science and Engineering, Northwest Normal University, Lanzhou, 730070, Gansu
Keywords
Convolutional neural network; Dilated convolution network; Image object detection; Light-weight network; Transfer learning
DOI
10.3969/j.issn.0372-2112.2020.02.023
Abstract
Object detection is an important research direction in computer vision. In recent years it has made great advances on public datasets, with corresponding breakthroughs in algorithmic performance. To improve both the accuracy and the speed of two-stage object detection, this paper proposes a transfer-learning-based detection model that fuses a deep dilated convolutions network with a light-weight network. First, dilated convolutions replace the convolutional residual modules in the backbone network, yielding the deep dilated convolution network (D_dNet-65). Then, a light-weight network is built by compressing the pretrained feature map and replacing the original two fully connected layers with a single 81-class fully connected layer. Finally, transfer learning is introduced during pretraining to optimize the model (D_dNet plus the light-weight network). Experiments on the representative MS COCO and VOC07 datasets show that the proposed method is both effective and scalable. © 2020, Chinese Institute of Electronics. All rights reserved.
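Note on the method (illustrative): the abstract describes two architectural changes, replacing the residual convolution modules of the backbone with dilated-convolution blocks (D_dNet-65) and replacing the head's two heavy fully connected layers with a compressed feature map followed by a single 81-class fully connected layer. The PyTorch sketch below illustrates these two ideas only; the dilation rates, channel widths, and 7x7 pooled size are assumptions made for the example, not the configuration reported in the paper.

# Minimal PyTorch sketch of the two ideas described in the abstract.
# Dilation rates, channel widths, and the pooled feature size are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class DilatedBottleneck(nn.Module):
    """Residual bottleneck whose 3x3 convolution is dilated, enlarging
    the receptive field without additional downsampling."""
    def __init__(self, in_ch, mid_ch, dilation=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            # padding == dilation keeps the spatial size unchanged
            nn.Conv2d(mid_ch, mid_ch, 3, padding=dilation,
                      dilation=dilation, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, in_ch, 1, bias=False),
            nn.BatchNorm2d(in_ch),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))

class LightWeightHead(nn.Module):
    """Compresses RoI features with a 1x1 convolution, then classifies
    with one 81-way fully connected layer (80 MS COCO classes plus
    background) instead of two large fully connected layers."""
    def __init__(self, in_ch=2048, squeeze_ch=256, pooled=7, num_classes=81):
        super().__init__()
        self.squeeze = nn.Sequential(
            nn.Conv2d(in_ch, squeeze_ch, 1, bias=False),
            nn.BatchNorm2d(squeeze_ch),
            nn.ReLU(inplace=True),
        )
        self.fc = nn.Linear(squeeze_ch * pooled * pooled, num_classes)

    def forward(self, roi_feat):
        x = self.squeeze(roi_feat)
        return self.fc(torch.flatten(x, 1))

if __name__ == "__main__":
    block = DilatedBottleneck(in_ch=1024, mid_ch=256, dilation=2)
    head = LightWeightHead()
    feat = block(torch.randn(2, 1024, 14, 14))    # one backbone stage
    scores = head(torch.randn(2, 2048, 7, 7))     # RoI-pooled features
    print(feat.shape, scores.shape)               # (2, 1024, 14, 14), (2, 81)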
Pages: 390-397 (7 pages)