Interactive multi-scale feature representation enhancement for small object detection

Cited by: 13
Authors
Zheng, Qiyuan [1]
Chen, Ying [1]
Affiliations
[1] Jiangnan Univ, Key Lab Adv Proc Control Light Ind Minist Educ, Wuxi 214122, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Object detection; Small objects; Deep learning; Multi-scale feature fusion
DOI
10.1016/j.imavis.2021.104128
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In object detection, there is a wide gap between performance on small objects and performance on medium and large objects. Some studies attribute this gap to the conflict between classification-oriented backbones and the localization task. Although reducing the feature map size helps extract abstract features, it causes the detailed features needed for localization to be lost as the input traverses the backbone. Therefore, an interactive multi-scale feature representation enhancement strategy is proposed. The strategy comprises two modules. First, a multi-scale auxiliary enhancement network is proposed for feature interaction under multiple inputs: the input is scaled to multiple scales corresponding to the prediction layers, and each scaled input passes only through a lightweight extraction module, which extracts more detailed features to enhance the original features. Second, an adaptive interaction module is designed to aggregate the features of adjacent layers. This approach offers a flexible way to improve small object detection without changing the original network structure. Comprehensive experimental results on the PASCAL VOC and MS COCO datasets show the effectiveness of the proposed method. © 2021 Elsevier B.V. All rights reserved.
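The two ideas named in the abstract, rescaling the input once per prediction layer and aggregating features of adjacent layers, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 2x2 average pooling, nearest-neighbour upsampling, softmax-weighted sum, and every function name below are assumptions chosen for illustration, and the lightweight extraction module is omitted entirely.

```python
import math

def downsample2x(grid):
    """Halve a 2-D grid by averaging each 2x2 block (assumed stand-in for input rescaling)."""
    h, w = len(grid) // 2, len(grid[0]) // 2
    return [[(grid[2 * i][2 * j] + grid[2 * i][2 * j + 1]
              + grid[2 * i + 1][2 * j] + grid[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]

def build_pyramid(grid, levels):
    """Produce one rescaled copy of the input per (hypothetical) prediction layer, finest first."""
    pyramid = [grid]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid

def upsample2x(grid):
    """Nearest-neighbour upsampling so a coarser map matches the next finer one."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_adjacent(fine, coarse, scores):
    """Weighted sum of a layer and its upsampled coarser neighbour.

    `scores` plays the role of learnable fusion weights (an assumption;
    the abstract does not specify how the aggregation is weighted).
    """
    w_fine, w_coarse = softmax(scores)
    up = upsample2x(coarse)
    return [[w_fine * a + w_coarse * b for a, b in zip(r_fine, r_up)]
            for r_fine, r_up in zip(fine, up)]

# Usage: a 4x4 toy "image" rescaled for three layers, then the two finest layers fused.
image = [[float(4 * i + j) for j in range(4)] for i in range(4)]
pyramid = build_pyramid(image, 3)
fused = fuse_adjacent(pyramid[0], pyramid[1], [0.0, 0.0])
```

With equal scores the fusion reduces to a plain average of the two layers; in a trained network the scores would be learned so that each layer can decide how much to borrow from its neighbour.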
Pages: 11