SA-FPN: An effective feature pyramid network for crowded human detection

Cited by: 114

Authors:

Zhou, Xinxin [1]

Zhang, Long [1]

Affiliations:

[1] Northeast Elect Power Univ, Sch Comp Sci, Jilin 132012, Jilin, Peoples R China

Source:

APPLIED INTELLIGENCE | 2022 / Vol. 52 / Issue 11

Keywords:

Object detection; Human detection; Crowd; Convolutional neural networks; Feature pyramid networks; PEDESTRIAN DETECTION;

DOI:

10.1007/s10489-021-03121-8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Crowded scenes not only contain instances at various scales but also introduce a variety of occlusion patterns, ranging from non-occluded to heavily occluded cases, so that the visible shapes of instances differ widely. Together, these factors make human detectors hard to apply to such scenes. Feature pyramid networks (FPN), an indispensable part of generic object detectors, can significantly boost detection performance on objects at different scales. In this paper, we therefore equip FPN with multi-scale feature fusion and attention mechanisms to improve human detection in crowded scenarios. First, we design a feature pyramid structure with a refined hierarchical-split block, referred to as Scale-FPN, which better handles scale variation across object instances. Second, we propose an attention-based lateral connection (ALC) module with spatial and channel attention mechanisms to replace the lateral connection in FPN; it enhances the representational ability of feature maps through rich spatial and semantic information and lets detectors focus on important features of occlusion patterns. Additionally, we adopt a bottom-up path augmentation (BPA) module to exploit the features of the Scale-FPN and ALC modules. To verify the effectiveness of the proposed method, we combine Scale-FPN, ALC and BPA into SA-FPN and integrate it into the architecture of a crowded human detector. Experiments on the challenging CrowdHuman benchmark validate the effectiveness of SA-FPN. Specifically, it improves the state-of-the-art result of CrowdDet from 41.4% to 39.9% MR^{-2} (lower is better), which indicates that the detector with SA-FPN produces fewer false positives.
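
The metric quoted at the end of the abstract, MR^{-2}, is the log-average miss rate: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log-space over [10^-2, 10^0]. Because it averages miss rates, a drop from 41.4% to 39.9% is an improvement of 1.5 points. The following is a minimal sketch of that computation, not the evaluation code used in the paper. The function name `log_average_miss_rate`, the rule for sampling the curve (last point at or below each reference FPPI, falling back to a miss rate of 1.0), and the toy curves are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def log_average_miss_rate(fppi, miss_rate, num_points=9, eps=1e-10):
    """Geometric mean of miss rates sampled at log-spaced FPPI reference points.

    fppi      -- false positives per image, sorted in ascending order
    miss_rate -- miss rate at each FPPI value (same length as fppi)
    """
    if len(fppi) != len(miss_rate) or not fppi:
        raise ValueError("fppi and miss_rate must be non-empty and equally long")

    # Reference points evenly spaced in log-space over [1e-2, 1e0].
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    log_sum = 0.0
    for ref in refs:
        # Last curve point whose FPPI does not exceed the reference point.
        idx = bisect_right(fppi, ref) - 1
        # Assumption: if the curve starts to the right of ref, use a miss rate of 1.0.
        mr = miss_rate[idx] if idx >= 0 else 1.0
        # Clamp to avoid log(0) for a detector with zero misses.
        log_sum += math.log(max(mr, eps))

    return math.exp(log_sum / num_points)


# Toy curves (illustrative values only, not results from the paper).
curve_fppi = [0.005, 0.05, 0.5, 1.0]
baseline = log_average_miss_rate(curve_fppi, [0.60, 0.45, 0.35, 0.30])
improved = log_average_miss_rate(curve_fppi, [0.58, 0.43, 0.33, 0.28])
print(improved < baseline)  # a lower score means fewer missed people
```

Evaluation toolkits differ in how they sample the curve at each reference point, so absolute values from this sketch should not be compared with published numbers.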

Pages: 12556-12568

Page count: 13
