Person Re-Identification by Cross-View Multi-Level Dictionary Learning

Cited by: 155
Authors
Li, Sheng [1]
Shao, Ming [2]
Fu, Yun [3, 4]
Affiliations
[1] Adobe Res, San Jose, CA 95110 USA
[2] Univ Massachusetts, Dept Comp & Informat Sci, Dartmouth, MA 02747 USA
[3] Northeastern Univ, Coll Engn, Dept Elect & Comp Engn, Boston, MA 02115 USA
[4] Northeastern Univ, Coll Comp & Informat Sci, Boston, MA 02115 USA
Keywords
Dictionary learning; cross-view learning; multi-level representation; person re-identification; RANK
DOI
10.1109/TPAMI.2017.2764893
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Person re-identification plays an important role in many safety-critical applications. Existing works mainly focus on extracting patch-level features or learning distance metrics. However, the representation power of the extracted features may be limited by the varied viewing conditions of pedestrian images in complex real-world scenarios. To improve the representation power of features, we learn discriminative and robust representations via dictionary learning. First, we propose a Cross-view Dictionary Learning (CDL) model, which is a general solution to the multi-view learning problem. Inspired by dictionary learning based domain adaptation, CDL learns a pair of dictionaries from two views. In particular, CDL adopts a projective learning strategy, which is more efficient than the ℓ1 optimization used in traditional dictionary learning. Second, we propose a Cross-view Multi-level Dictionary Learning (CMDL) approach based on CDL. CMDL contains dictionary learning models at different representation levels: image level, horizontal part level, and patch level. The proposed models take advantage of view-consistency information and adaptively learn pairs of dictionaries to generate robust and compact representations for pedestrian images. Third, we incorporate a discriminative regularization term into CMDL and propose a CMDL-Dis approach, which learns pairs of discriminative dictionaries at the image level and part level. We devise efficient optimization algorithms to solve the proposed models. Finally, a fusion strategy is used to generate the similarity scores for test images. Experiments on the public VIPeR, CUHKCampus, iLIDS, GRID and PRID450S datasets show that our approach achieves state-of-the-art performance.
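The abstract names two ideas that a small sketch may make more concrete: projective coding (obtaining a code by a linear projection rather than by solving an ℓ1 problem per sample) and fusing similarity scores from several representation levels. The Python below is only an illustration under stated assumptions, not the authors' algorithm: the projection matrices are random rather than learned, the similarity measure (cosine), the fusion rule (weighted sum) and all names (`projective_codes`, `cosine_scores`, `fuse_scores`, the weights) are hypothetical and do not come from the record.

```python
import numpy as np


def projective_codes(P, X):
    """Encode the columns of X by a linear projection: codes = P @ X.

    Assumption: P stands in for a learned projection. No l1 solver is run.
    """
    return P @ X


def cosine_scores(A_probe, A_gallery):
    """Cosine similarity between each probe code and each gallery code.

    Codes are stored as columns; the result has shape (n_probe, n_gallery).
    """
    a = A_probe / (np.linalg.norm(A_probe, axis=0, keepdims=True) + 1e-12)
    b = A_gallery / (np.linalg.norm(A_gallery, axis=0, keepdims=True) + 1e-12)
    return a.T @ b


def fuse_scores(level_scores, weights):
    """Weighted sum of per-level score matrices (an assumed fusion rule)."""
    fused = np.zeros_like(level_scores[0], dtype=float)
    for scores, weight in zip(level_scores, weights):
        fused += weight * scores
    return fused


# Toy data: 5 identities seen in two camera views, 20-dim features, 8-dim codes.
rng = np.random.default_rng(0)
n_ids, dim, code_dim = 5, 20, 8
levels = ["image", "part", "patch"]  # level names taken from the abstract

per_level = []
for _ in levels:
    X_probe = rng.standard_normal((dim, n_ids))                      # view 1 features
    X_gallery = X_probe + 0.1 * rng.standard_normal((dim, n_ids))    # view 2 features
    P_probe = rng.standard_normal((code_dim, dim))    # hypothetical view-1 projection
    P_gallery = rng.standard_normal((code_dim, dim))  # hypothetical view-2 projection
    per_level.append(
        cosine_scores(projective_codes(P_probe, X_probe),
                      projective_codes(P_gallery, X_gallery))
    )

fused = fuse_scores(per_level, weights=[0.5, 0.3, 0.2])  # weights are illustrative
best_match = fused.argmax(axis=1)  # top-ranked gallery index for each probe
print(fused.shape, best_match)
```

Because the projections here are random and unrelated across views, the printed matches carry no meaning; the point is only the data flow from per-view projection, to per-level scores, to a fused score matrix that is ranked per probe.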
Pages: 2963-2977
Page count: 15