Large-Scale Pre-training for Person Re-identification with Noisy Labels

Cited by: 48

Authors:

Fu, Dengpan [1]

Chen, Dongdong [3]

Yang, Hao [2]

Bao, Jianmin [2]

Yuan, Lu [3]

Zhang, Lei [4]

Li, Houqiang [1]

Wen, Fang [2]

Chen, Dong [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Peoples R China

[2] Microsoft Res, Redmond, WA 98052 USA

[3] Microsoft Cloud AI, Orlando, FL USA

[4] IDEA, Hangzhou, Peoples R China

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

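Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/CVPR52688.2022.00251

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract: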

This paper addresses the problem of pre-training for person re-identification (Re-ID) with noisy labels. To set up the pre-training task, we apply a simple online multi-object tracking system to the raw videos of an existing unlabeled Re-ID dataset, "LUPerson", and build the Noisy Labeled variant called "LUPerson-NL". Since these ID labels, automatically derived from tracklets, inevitably contain noise, we develop a large-scale Pre-training framework utilizing Noisy Labels (PNL), which consists of three learning modules: supervised Re-ID learning, prototype-based contrastive learning, and label-guided contrastive learning. In principle, joint learning of these three modules not only clusters similar examples to one prototype, but also rectifies noisy labels based on the prototype assignment. We demonstrate that learning directly from raw videos is a promising alternative for pre-training, one that uses spatial and temporal correlations as weak supervision. This simple pre-training task provides a scalable way to learn SOTA Re-ID representations from scratch on "LUPerson-NL" without bells and whistles. For example, when applied to the same supervised Re-ID method, MGN, our pre-trained model improves the mAP over the unsupervised pre-training counterpart by 5.7%, 2.2%, and 2.3% on CUHK03, DukeMTMC, and MSMT17, respectively. Under the small-scale or few-shot setting, the performance gain is even more significant, suggesting better transferability of the learned representation. Code is available at https://github.com/DengpanFu/LUPerson-NL.
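The abstract's idea of rectifying noisy labels from the prototype assignment can be illustrated with a small sketch. This is not the authors' implementation: the function names, the mean-feature prototypes, the softmax `temperature`, and the confidence `threshold` are all assumptions chosen for illustration, and the rule used in PNL may differ.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_prototypes(features, labels):
    """One prototype per identity: the normalized mean of its normalized features."""
    groups = defaultdict(list)
    for feat, label in zip(features, labels):
        groups[label].append(l2_normalize(feat))
    prototypes = {}
    for label, feats in groups.items():
        mean = [sum(col) / len(feats) for col in zip(*feats)]
        prototypes[label] = l2_normalize(mean)
    return prototypes


def rectify_labels(features, labels, temperature=0.1, threshold=0.8):
    """Hypothetical rule: move a sample to its most similar prototype only when
    the softmax score over prototypes exceeds `threshold`; otherwise keep the
    original (possibly noisy) label."""
    prototypes = build_prototypes(features, labels)
    ids = sorted(prototypes)
    rectified = []
    for feat, label in zip(features, labels):
        f = l2_normalize(feat)
        # Cosine similarity to every prototype, sharpened by the temperature.
        logits = [
            sum(a * b for a, b in zip(f, prototypes[i])) / temperature
            for i in ids
        ]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = max(range(len(ids)), key=probs.__getitem__)
        rectified.append(ids[best] if probs[best] > threshold else label)
    return rectified
```

Calling `rectify_labels(features, labels)` on tracklet-derived labels returns a list of the same length in which only confidently misassigned samples change identity. Raising `threshold` makes the rule more conservative.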

Citation

Pages: 2466-2476

Page count: 11

