FOCUS ON SEMANTIC CONSISTENCY FOR CROSS-DOMAIN CROWD UNDERSTANDING

Cited by: 0
Authors
Han, Tao [1 ,2 ]
Gao, Junyu [1 ,2 ]
Yuan, Yuan [1 ,2 ]
Wang, Qi [1 ,2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Ctr Opt IMagery Anal & Learning OPTIMAL, Xian 710072, Shaanxi, Peoples R China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Crowd counting; domain adaptation; semantic consistency; adversarial learning;
DOI
10.1109/icassp40776.2020.9054768
CLC Number
O42 [Acoustics];
Subject Classification Codes
070206 ; 082403 ;
Abstract
For pixel-level crowd understanding, data collection and annotation are time-consuming and laborious. Some domain adaptation algorithms attempt to alleviate this burden by training models on synthetic data, and recent work has demonstrated the feasibility of this approach. However, we find that a large number of estimation errors in background areas impedes the performance of existing methods. In this paper, we propose a domain adaptation method to eliminate these errors. Based on semantic consistency, i.e., the similar distribution of deep-layer features between synthetic and real-world crowd areas, we first introduce a semantic extractor that effectively distinguishes crowd from background using high-level semantic information. In addition, to further strengthen the adapted model, we adopt adversarial learning to align features in the semantic space. Experiments on three representative real-world datasets show that the proposed domain adaptation scheme achieves state-of-the-art performance on cross-domain counting problems.
Pages: 1848-1852
Number of pages: 5
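The abstract describes a two-part scheme: a semantic extractor that separates crowd from background in high-level features, and adversarial learning that aligns synthetic and real-world features in that semantic space. The following is a minimal, hypothetical PyTorch sketch of that general idea; the module names (Backbone, SemanticExtractor, DomainDiscriminator), loss weights, and training loop are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of semantic-consistency-guided adversarial adaptation
# for crowd counting. Module names and losses are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Backbone(nn.Module):
    """Shared feature extractor (stand-in for a deeper encoder)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class DensityHead(nn.Module):
    """Regresses a per-pixel density map for counting."""
    def __init__(self):
        super().__init__()
        self.out = nn.Conv2d(64, 1, 1)
    def forward(self, f):
        return torch.relu(self.out(f))

class SemanticExtractor(nn.Module):
    """Predicts a crowd-vs-background mask from high-level features."""
    def __init__(self):
        super().__init__()
        self.out = nn.Conv2d(64, 1, 1)
    def forward(self, f):
        return torch.sigmoid(self.out(f))

class DomainDiscriminator(nn.Module):
    """Classifies whether features come from synthetic or real images."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
        )
    def forward(self, f):
        return self.net(f)

backbone, density, semantic, disc = Backbone(), DensityHead(), SemanticExtractor(), DomainDiscriminator()
opt_g = torch.optim.Adam(
    list(backbone.parameters()) + list(density.parameters()) + list(semantic.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

# One illustrative step with random tensors standing in for real data loaders.
syn_img, syn_den, syn_mask = torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
real_img = torch.rand(2, 3, 64, 64)

# 1) Supervised density and crowd-mask losses on labeled synthetic data.
f_syn = backbone(syn_img)
loss_sup = F.mse_loss(density(f_syn), syn_den) + bce(semantic.out(f_syn), syn_mask)

# 2) Adversarial alignment: make real-domain features, weighted by the
#    predicted crowd mask, indistinguishable from synthetic ones.
f_real = backbone(real_img)
masked_real = f_real * semantic(f_real)          # focus alignment on crowd regions
loss_adv = bce(disc(masked_real), torch.ones(2, 1))
opt_g.zero_grad(); (loss_sup + 0.1 * loss_adv).backward(); opt_g.step()

# 3) Discriminator update on detached features from both domains.
d_loss = bce(disc(backbone(syn_img).detach()), torch.ones(2, 1)) + \
         bce(disc(masked_real.detach()), torch.zeros(2, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()
```

The key design point the abstract emphasizes is that alignment is restricted to semantically consistent (crowd) regions rather than the whole feature map, which is why the sketch weights the real-domain features by the predicted crowd mask before passing them to the discriminator.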