Adaptive Dilated Network with Self-Correction Supervision for Counting

Cited by: 153

Authors:

Bai, Shuai [1]

He, Zhiqun [2]

Qiao, Yu [3]

Hu, Hanzhe [4]

Wu, Wei [2]

Yan, Junjie [2]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

[2] SenseTime Grp Ltd, Singapore, Singapore

[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China

[4] Peking Univ, Beijing, Peoples R China

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020

Keywords:

CROWDED SCENES; PEOPLE; SEGMENTATION; NUMBER;

DOI:

10.1109/CVPR42600.2020.00465

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The counting problem aims to estimate the number of objects in images. Due to large scale variation and labeling deviations, it remains a challenging task. The static density map supervised learning framework is widely used in existing methods; it uses a Gaussian kernel to generate a density map as the learning target and the Euclidean distance to optimize the model. However, this framework is intolerant of labeling deviations and cannot reflect scale variation. In this paper, we propose an adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision. At the supervision level, SC supervision uses the outputs of the model to iteratively correct the annotations and employs the SC loss to simultaneously optimize the model from both the whole and the individuals. At the feature level, the proposed adaptive dilated convolution predicts a continuous value as the specific dilation rate for each location, which adapts to scale variation better than a discrete and static dilation rate. Extensive experiments show that our approach achieves consistent improvement on four challenging benchmarks. In particular, our approach achieves better performance than state-of-the-art methods on all benchmark datasets.
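The static density map baseline that the abstract criticizes can be illustrated with a minimal sketch. It is not the authors' code: the kernel size, sigma, map size, point coordinates, and all function names below are assumptions chosen for illustration. It places one normalized Gaussian at each annotated point, so the map sums to the object count, and it shows that shifting a single annotation by one pixel already changes the Euclidean loss, which is the sensitivity to labeling deviations the abstract describes.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    raw = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def density_map(points, height, width, size=5, sigma=1.0):
    """Add one normalized Gaussian per annotated (row, col) point."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    target = [[0.0] * width for _ in range(height)]
    for r, c in points:
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:  # clip at image borders
                    target[y][x] += kernel[dy][dx]
    return target

def euclidean_loss(pred, target):
    """Sum of squared per-pixel differences between two maps."""
    return sum((p - t) ** 2
               for pred_row, target_row in zip(pred, target)
               for p, t in zip(pred_row, target_row))

# Hypothetical annotations on a 16 x 16 image, kept away from the borders.
points = [(4, 4), (10, 12)]
target = density_map(points, 16, 16)
count = sum(sum(row) for row in target)  # integrates to the number of points

# The same scene with the first annotation shifted one pixel to the right.
shifted = density_map([(4, 5), (10, 12)], 16, 16)
deviation_loss = euclidean_loss(shifted, target)
```

Because the kernel width is fixed here, the target is the same for large and small objects, which is the second limitation the abstract names (the static target cannot reflect scale variation).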


Pages: 4593-4602

Page count: 10
