SAN: Selective Alignment Network for Cross-Domain Pedestrian Detection

Cited by: 17
Authors
Jiao, Yifan [1]
Yao, Hantao [2]
Xu, Changsheng [2,3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Proposals; Feature extraction; Detectors; Visualization; Training; Image color analysis; Adaptation models; Cross-domain pedestrian detection; instance-level adaptation network; image-level adaptation network; pedestrian detection;
DOI
10.1109/TIP.2021.3049948
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Cross-domain pedestrian detection, which has attracted much attention, assumes that the training and test images are drawn from different data distributions. Existing methods focus on aligning the descriptions of all candidate instances as a whole between the source and target domains. Because the candidate instances differ greatly in visual appearance, aligning them as a whole between the two domains cannot overcome the inter-instance difference. We argue that aligning each type of instance separately is more reasonable than aligning all candidate instances together. Therefore, we propose a novel Selective Alignment Network for cross-domain pedestrian detection, which consists of three components: a Base Detector, an Image-Level Adaptation Network, and an Instance-Level Adaptation Network. The Image-Level Adaptation Network and the Instance-Level Adaptation Network can be regarded as global-level and local-level alignments, respectively. Similar to Faster R-CNN, the Base Detector, which is composed of a Feature module, an RPN module, and a Detection module, is used to learn a robust pedestrian detector from the annotated source data. Given the image description extracted by the Feature module, the Image-Level Adaptation Network aligns the image descriptions of the two domains with an adversarial domain classifier. Given the candidate proposals generated by the RPN module, the Instance-Level Adaptation Network first clusters the source candidate proposals into several groups according to their visual features, and thus generates a pseudo label for each candidate proposal. After generating the pseudo labels, we align the source and target domains by iteratively maximizing and minimizing the discrepancy between the predictions of two classifiers. Extensive evaluations on several benchmarks demonstrate the effectiveness of the proposed approach for cross-domain pedestrian detection.
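The instance-level step described in the abstract (cluster source proposals by visual feature to obtain pseudo labels, then measure the disagreement between two classifiers) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the clustering algorithm or the discrepancy measure, so plain k-means with a deterministic farthest-point initialization and a mean absolute difference are assumptions here, and the names `kmeans_pseudo_labels` and `classifier_discrepancy` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pseudo_labels(features, k, iters=20):
    """Group proposal features with plain k-means (an assumed choice).

    Returns one group index per proposal; that index plays the role of
    the pseudo label mentioned in the abstract.
    """
    # Deterministic farthest-point initialization (a simplification):
    # start from the first feature, then repeatedly add the feature that
    # is farthest from all centers chosen so far.
    centers = [list(features[0])]
    while len(centers) < k:
        farthest = max(
            features,
            key=lambda f: min(squared_distance(f, c) for c in centers),
        )
        centers.append(list(farthest))

    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each proposal joins its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(f, centers[c]))
            for f in features
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def classifier_discrepancy(p1, p2):
    """Mean absolute difference between two probability vectors
    (an assumed discrepancy measure)."""
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


# Toy usage: four 2-D "proposal features" forming two obvious groups.
toy_features = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
toy_labels = kmeans_pseudo_labels(toy_features, k=2)
toy_gap = classifier_discrepancy([0.9, 0.1], [0.6, 0.4])
```

In an adversarial scheme of the kind the abstract outlines, a quantity like `toy_gap` would be maximized with respect to the two classifiers and minimized with respect to the feature extractor in alternating steps; the training loop itself is omitted here.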
Pages: 2155-2167
Page count: 13
Related papers
64 in total
[1]  
[Anonymous], 4 INT C LEARN REPR I
[2]  
[Anonymous], 2009, BMVC, DOI DOI 10.5244/C.23.91
[3]  
[Anonymous], 2018, P 6 INT C LEARN REPR
[4]   Scalable K-Means++ [J].
Bahmani, Bahman ;
Moseley, Benjamin ;
Vattani, Andrea ;
Kumar, Ravi ;
Vassilvitskii, Sergei .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (07) :622-633
[5]   Ten Years of Pedestrian Detection, What Have We Learned? [J].
Benenson, Rodrigo ;
Omran, Mohamed ;
Hosang, Jan ;
Schiele, Bernt .
COMPUTER VISION - ECCV 2014 WORKSHOPS, PT II, 2015, 8926 :613-627
[6]   Illuminating Pedestrians via Simultaneous Detection & Segmentation [J].
Brazil, Garrick ;
Yin, Xi ;
Liu, Xiaoming .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :4960-4969
[7]   Exploring Object Relation in Mean Teacher for Cross-Domain Detection [J].
Cai, Qi ;
Pan, Yingwei ;
Ngo, Chong-Wah ;
Tian, Xinmei ;
Duan, Lingyu ;
Yao, Ting .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :11449-11458
[8]   A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection [J].
Cai, Zhaowei ;
Fan, Quanfu ;
Feris, Rogerio S. ;
Vasconcelos, Nuno .
COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 :354-370
[9]   Taking a Look at Small-Scale Pedestrians and Occluded Pedestrians [J].
Cao, Jiale ;
Pang, Yanwei ;
Han, Jungong ;
Gao, Bolin ;
Li, Xuelong .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 :3143-3152
[10]   Improved Techniques for Adversarial Discriminative Domain Adaptation [J].
Chadha, Aaron ;
Andreopoulos, Yiannis .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 :2622-2637