Taking a Look at Small-Scale Pedestrians and Occluded Pedestrians

Cited by: 41

Authors:

Cao, Jiale [1]

Pang, Yanwei [1]

Han, Jungong [2]

Gao, Bolin [3]

Li, Xuelong [4,5]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin Key Lab Brain Inspired Intelligence Techn, Tianjin 300072, Peoples R China

[2] Univ Warwick, Data Sci Grp, Coventry CV4 7AL, W Midlands, England

[3] China Automot Technol & Res Ctr Co Ltd, Tianjin 300300, Peoples R China

[4] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China

[5] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian 710072, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2020 / Vol. 29

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation;

Keywords:

Small-scale pedestrians; occluded pedestrians; location bootstrap; semantic transition;

DOI:

10.1109/TIP.2019.2957927

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Small-scale pedestrian detection and occluded pedestrian detection are two challenging tasks. However, most state-of-the-art methods handle only one of the two tasks at a time, which leads to relatively poor performance when both tasks must be handled simultaneously in practice. This paper finds that small-scale pedestrian detection and occluded pedestrian detection share a common problem, namely inaccurate location. Solving this problem can therefore improve the performance of both tasks. To this end, we pay more attention to the predicted bounding boxes with worse location precision and extract more contextual information around objects, for which two modules (i.e., location bootstrap and semantic transition) are proposed. The location bootstrap reweights the regression loss: the loss of a predicted bounding box far from its corresponding ground-truth is upweighted, and the loss of a predicted bounding box near its corresponding ground-truth is downweighted. Additionally, the semantic transition adds more contextual information and relieves the semantic inconsistency of the skip-layer fusion. Since the location bootstrap is not used at the test stage and the semantic transition is lightweight, the proposed method adds little extra computational cost during inference. Experiments on the challenging CityPersons and Caltech datasets show that the proposed method outperforms the state-of-the-art methods on small-scale pedestrians and occluded pedestrians (e.g., 5.20 and 4.73 improvements on Caltech).
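The reweighting idea behind the location bootstrap can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so everything specific here is an assumption made for illustration: the weight form `1 - IoU`, the normalization to a mean weight of 1, the smooth L1 base loss on raw box coordinates, and the function names `iou`, `smooth_l1`, `bootstrap_weights`, and `reweighted_regression_loss`. It is not the authors' implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss summed over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def bootstrap_weights(preds, gts):
    """Per-box weights that grow as the prediction drifts from its ground truth.

    Assumed form: raw weight = 1 - IoU, rescaled so the mean weight is 1.
    Boxes with below-average IoU get a weight above 1 (upweighted);
    boxes with above-average IoU get a weight below 1 (downweighted).
    """
    raw = [1.0 - iou(p, g) for p, g in zip(preds, gts)]
    mean = sum(raw) / len(raw)
    if mean == 0.0:  # every prediction already matches its ground truth
        return [1.0] * len(raw)
    return [r / mean for r in raw]


def reweighted_regression_loss(preds, gts):
    """Mean of the per-box regression losses after reweighting."""
    weights = bootstrap_weights(preds, gts)
    losses = [smooth_l1(p, g) for p, g in zip(preds, gts)]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Two predictions for the same ground truth: one near, one far.
gts = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
preds = [(1.0, 0.0, 11.0, 10.0), (5.0, 0.0, 15.0, 10.0)]
weights = bootstrap_weights(preds, gts)
loss = reweighted_regression_loss(preds, gts)
```

In this toy case the near prediction receives a weight below 1 and the far prediction a weight above 1. In a real training loop the weights would normally be treated as constants when computing gradients, and, as the abstract notes, the reweighting plays no role at test time.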


Pages: 3143-3152

Page count: 10
