Camera-Driven Representation Learning for Unsupervised Domain Adaptive Person Re-identification

Cited by: 19
Authors
Lee, Geon [1 ]
Lee, Sanghoon [1 ]
Kim, Dohyung [1 ]
Shin, Younghoon [2 ]
Yoon, Yongsang [2 ]
Ham, Bumsub [1 ]
Affiliations
[1] Yonsei Univ, Seoul, South Korea
[2] Hyundai Motor Co, Robot Lab, Seoul, South Korea
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
Keywords
DOI
10.1109/ICCV51070.2023.01052
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
We present a novel unsupervised domain adaptation method for person re-identification (reID) that generalizes a model trained on a labeled source domain to an unlabeled target domain. We introduce a camera-driven curriculum learning (CaCL) framework that leverages camera labels of person images to transfer knowledge from the source to the target domain progressively. To this end, we divide the target domain dataset into multiple subsets based on the camera labels, and initially train our model with a single subset (i.e., images captured by a single camera). We then gradually exploit more subsets for training, according to a curriculum sequence obtained with a camera-driven scheduling rule. The scheduler considers the maximum mean discrepancy (MMD) between each subset and the source domain dataset, such that subsets closer to the source domain are exploited earlier in the curriculum. For each curriculum stage, we generate pseudo labels of person images in the target domain to train a reID model in a supervised way. We have observed that the pseudo labels are highly biased toward cameras, that is, person images captured by the same camera are likely to receive the same pseudo label, even when they show different IDs. To address this camera bias problem, we also introduce a camera-diversity (CD) loss that encourages person images sharing the same pseudo label, but captured by different cameras, to contribute more to discriminative feature learning, yielding person representations robust to inter-camera variations. Experimental results on standard benchmarks, including real-to-real and synthetic-to-real scenarios, demonstrate the effectiveness of our framework.
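As a rough illustration of the camera-driven scheduling rule described in the abstract, the minimal PyTorch sketch below orders target-domain camera subsets by their estimated MMD to the source features, so that the subset closest to the source enters the curriculum first. This is not the authors' implementation: the Gaussian-kernel MMD estimator, the bandwidth sigma, the feature dimensions, and the helper names gaussian_mmd and camera_curriculum are assumptions made purely for the example.

# Minimal sketch (assumed, not the authors' code) of MMD-based camera-driven scheduling.
import torch

def gaussian_mmd(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Biased MMD^2 estimate between two feature sets using a Gaussian kernel.
    def kernel(a, b):
        d = torch.cdist(a, b) ** 2                  # pairwise squared Euclidean distances
        return torch.exp(-d / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

def camera_curriculum(source_feats: torch.Tensor,
                      target_feats: torch.Tensor,
                      target_cams: torch.Tensor) -> list:
    # Return camera ids sorted by MMD to the source domain (closest first).
    scored = []
    for cam in target_cams.unique().tolist():
        subset = target_feats[target_cams == cam]
        scored.append((cam, gaussian_mmd(source_feats, subset).item()))
    return [cam for cam, _ in sorted(scored, key=lambda t: t[1])]

if __name__ == "__main__":
    src = torch.randn(512, 256)                     # stand-in source-domain features
    tgt = torch.randn(1024, 256)                    # stand-in target-domain features
    cams = torch.randint(0, 6, (1024,))             # camera label per target image
    print(camera_curriculum(src, tgt, cams))        # e.g. [3, 0, 5, 1, 4, 2]

In the framework described above, the subsets returned in this order would then be added to the training set progressively, with pseudo labels regenerated at each curriculum stage.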
Pages: 11419-11428
Page count: 10