Enhanced semantic feature pyramid network for small object detection

Cited by: 16
Authors
Chen, Yuqi [1]
Zhu, Xiangbin [1]
Li, Yonggang [2]
Wei, Yuanwang [2]
Ye, Lihua [2]
Affiliations
[1] Zhejiang Normal Univ, Sch Math & Comp Sci, Jinhua 321004, Peoples R China
[2] Jiaxing Univ, Sch Informat Sci & Engn, Jiaxing 314001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature pyramid network; Small object detection; Semantic enhancement; Sub-pixel convolution; Deformable convolution
DOI
10.1016/j.image.2023.116919
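Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract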
Feature pyramid network-based models, which progressively fuse multi-scale features, have proven highly effective in object detection. However, these models often learn multi-scale features with ambiguous boundaries, because small objects occupying only a few pixels easily lose information during top-down propagation, which makes the multi-scale feature representation less effective. In this work, we propose an efficient Enhanced Semantic Feature Pyramid Network (ES-FPN), which combines high-level semantic information with low-level contextual information to improve multi-scale feature learning for small object detection. Specifically, the proposed network first exploits the rich semantic information in the lateral connections, which makes the features more semantic. Then, it recovers the information lost in the high-level, low-resolution feature maps by using the rich contextual information in the low-level, high-resolution feature maps. In this way, the high-level layers lose less important contextual information during progressive feature fusion, which avoids object disappearance and helps exploit the rich high-level semantic information. Finally, ES-FPN fuses the distributed features of each layer stage by stage, so that the final features are more semantic and better suited to localizing objects. Extensive experiments on three widely used object detection benchmarks (MS COCO, VOC and Cityscapes) demonstrate that our network accurately locates fairly complete objects with clear boundaries and outperforms previous feature pyramid-based methods.
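The keywords list sub-pixel convolution, and the abstract describes fusing low-resolution, high-level features with high-resolution, low-level ones. The abstract does not give the actual ES-FPN layers, so the following is only a generic illustration of the rearrangement step behind sub-pixel convolution (often called pixel shuffle), followed by an element-wise addition. The function names `pixel_shuffle` and `fuse_top_down` are hypothetical, the channel ordering is an assumed convention, and the learned convolution that would normally produce the input channels is omitted.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Each group of r*r input channels supplies the r x r block of output
    pixels at every spatial location (assumed channel ordering).
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside the r x r block
            for j in range(r):      # column offset inside the r x r block
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out


def fuse_top_down(high, low, r=2):
    """Upsample `high` by pixel shuffle and add it to `low` element-wise.

    Hypothetical fusion step for illustration only; not the ES-FPN design.
    """
    up = pixel_shuffle(high, r)
    return [
        [[a + b for a, b in zip(row_up, row_low)]
         for row_up, row_low in zip(ch_up, ch_low)]
        for ch_up, ch_low in zip(up, low)
    ]


# Four 1x2 channels become one 2x4 channel when r = 2.
high = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
low = [[[10, 10, 10, 10], [10, 10, 10, 10]]]
shuffled = pixel_shuffle(high, 2)
fused = fuse_top_down(high, low, 2)
```

The rearrangement trades channel depth for spatial resolution without interpolation. Whether ES-FPN applies it in exactly this position is not stated in the abstract.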
Pages: 10