CycAs: Self-supervised Cycle Association for Learning Re-identifiable Descriptions

Cited by: 68

Authors:

Wang, Zhongdao [1]

Zhang, Jingwei [1]

Zheng, Liang [2]

Liu, Yixuan [1]

Sun, Yifan [3]

Li, Yali [1]

Wang, Shengjin [1]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China

[2] Australian Natl Univ, Canberra, Australia

[3] MEGVII Technol, Beijing, Peoples R China

Source:

COMPUTER VISION - ECCV 2020, PT XI | 2020 / Vol. 12356

Keywords:

Self-supervised; Cycle consistency; Person re-ID

DOI:

10.1007/978-3-030-58621-8_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a self-supervised learning method for the person re-identification (re-ID) problem, where existing unsupervised methods usually rely on pseudo labels, such as those from video tracklets or clustering. A potential drawback of using pseudo labels is that errors may accumulate and it is challenging to estimate the number of pseudo IDs. We introduce a different unsupervised method that allows us to learn pedestrian embeddings from raw videos, without resorting to pseudo labels. The goal is to construct a self-supervised pretext task that matches the person re-ID objective. Inspired by the data association concept in multi-object tracking, we propose the Cycle Association (CycAs) task: after performing data association between a pair of video frames forward and then backward, a pedestrian instance is supposed to be associated to itself. To fulfill this goal, the model must learn a meaningful representation that can well describe correspondences between instances in frame pairs. We adapt the discrete association process to a differentiable form, such that end-to-end training becomes feasible. Experiments are conducted in two aspects: We first compare our method with existing unsupervised re-ID methods on seven benchmarks and demonstrate CycAs' superiority. Then, to further validate the practical value of CycAs in real-world applications, we perform training on self-collected videos and report promising performance on standard test sets.
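The cycle association idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cosine similarity, the row-wise softmax relaxation, the `temperature` value, and the negative-log-diagonal loss are all assumptions chosen for illustration, and the function names `cycle_association` and `cycle_loss` are hypothetical. The sketch softly associates instances from frame A to frame B and back again, then measures how strongly each instance in A returns to itself.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cycle_association(emb_a, emb_b, temperature=0.1):
    """Soft forward-then-backward association between two frames.

    emb_a: (n_a, d) embeddings of instances detected in frame A.
    emb_b: (n_b, d) embeddings of instances detected in frame B.
    Returns an (n_a, n_a) matrix whose entry (i, j) is the soft
    probability that instance i in A maps to B and back to instance j.
    The softmax relaxation and temperature are assumptions of this sketch.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                                      # cosine similarity, (n_a, n_b)
    forward = softmax(sim / temperature, axis=1)       # A -> B, rows sum to 1
    backward = softmax(sim.T / temperature, axis=1)    # B -> A, rows sum to 1
    return forward @ backward                          # A -> B -> A


def cycle_loss(cycle, eps=1e-12):
    """Assumed loss: mean negative log of the diagonal.

    It is small when every instance is associated back to itself.
    """
    return float(-np.mean(np.log(np.diag(cycle) + eps)))


# Toy usage: frame B holds the same four identities as frame A,
# in shuffled order and with slight noise added to the embeddings.
rng = np.random.default_rng(0)
frame_a = rng.normal(size=(4, 16))
frame_b = frame_a[[2, 0, 3, 1]] + 0.01 * rng.normal(size=(4, 16))
cycle = cycle_association(frame_a, frame_b)
print(np.round(cycle, 3))
print("loss:", cycle_loss(cycle))
```

Because both association steps are softmax operations rather than hard assignments, the cycle matrix is differentiable with respect to the embeddings, which is the property the abstract relies on for end-to-end training.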


Pages: 72-88

Page count: 17

