CB-FPN: object detection feature pyramid network based on context information and bidirectional efficient fusion

Cited by: 9
Authors
Liu, Zhibo [1]
Cheng, Jian [1]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei, Peoples R China
Keywords
Object detection; Context information; Deep learning; Feature pyramid network
DOI
10.1007/s10044-023-01173-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The feature pyramid network (FPN) is a typical structure in object detection. It improves detection accuracy by fusing feature information at different resolutions and strengthening the representational ability of features at different levels. However, two problems hinder the full exchange of feature information: the mismatch between the resolution of the feature information and the receptive field, and the limited ways in which features are fused. To solve these problems, this paper designs a new structure, an object detection feature pyramid network based on context information and efficient bidirectional fusion (CB-FPN). (1) Before feature fusion, a context enhancement module built on a cross stage partial network (CSPNet) module (CEM-CSP) applies carefully designed dilated convolutions to high-level features, yielding rich context information and receptive fields that match the feature information. (2) During feature fusion, a bidirectional efficient feature pyramid network (BE-FPN) module fuses features efficiently. After these two modified architectures are added to Faster R-CNN with ResNet-50, the average precision (AP) improves from 37.5 to 39.2 on the COCO val-2017 data set. In addition, extensive experiments show the effectiveness of the methods on one-stage, two-stage, and anchor-free models.
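The abstract states that dilated convolutions enlarge the receptive field of high-level features. A minimal sketch of the arithmetic behind that claim follows, using the standard receptive-field formula for stride-1 convolutions. The kernel sizes and dilation rates are hypothetical illustrations; the record does not give the values used in CEM-CSP, and the helper names are not from the paper.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a single dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied in sequence.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (k - 1) * d input positions.
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical settings for illustration only (not taken from the paper).
plain = [(3, 1), (3, 1), (3, 1)]    # three ordinary 3x3 convolutions
dilated = [(3, 1), (3, 2), (3, 4)]  # same depth, increasing dilation

print(effective_kernel(3, 4))            # 9
print(stacked_receptive_field(plain))    # 7
print(stacked_receptive_field(dilated))  # 15
```

With the same number of layers and parameters, the dilated stack in this sketch covers roughly twice the input span per axis, which is the kind of effect the abstract relies on to match receptive fields to feature resolution.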
Pages: 1441-1452
Number of pages: 12
Related papers
33 in total
  • [1] Cascade R-CNN: Delving into High Quality Object Detection
    Cai, Zhaowei
    Vasconcelos, Nuno
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6154 - 6162
  • [2] Cao JX, 2020, Arxiv, DOI arXiv:2005.11475
  • [3] Deep learning for neurodegenerative disorder (2016 to 2022): A systematic review
    Chaki, Jyotismita
    Wozniak, Marcin
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 80
  • [4] SAANet: Spatial adaptive alignment network for object detection in automatic driving
    Chen, Junying
    Bai, Tongyao
    [J]. IMAGE AND VISION COMPUTING, 2020, 94 (94)
  • [5] Chen K, 2019, Arxiv, DOI arXiv:1906.07155
  • [6] Iterative Visual Reasoning Beyond Convolutions
    Chen, Xinlei
    Li, Li-Jia
    Li Fei-Fei
    Gupta, Abhinav
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7239 - 7248
  • [7] Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
  • [8] The Pascal Visual Object Classes (VOC) Challenge
    Everingham, Mark
    Van Gool, Luc
    Williams, Christopher K. I.
    Winn, John
    Zisserman, Andrew
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2010, 88 (02) : 303 - 338
  • [9] NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
    Ghiasi, Golnaz
    Lin, Tsung-Yi
    Le, Quoc V.
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7029 - 7038
  • [10] Guo CX, 2020, PROC CVPR IEEE, P12592, DOI 10.1109/CVPR42600.2020.01261