DisP+V: A Unified Framework for Disentangling Prototype and Variation From Single Sample per Person

Cited by: 12
Authors
Pang, Meng [1]
Wang, Binghui [2]
Ye, Mang [3]
Cheung, Yiu-ming [4]
Chen, Yiran [5]
Wen, Bihan [1]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] IIT, Dept Comp Sci, Chicago, IL 60616 USA
[3] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[4] Hong Kong Baptist Univ, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[5] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27708 USA
Funding
National Research Foundation, Singapore;
Keywords
Prototypes; Face recognition; Faces; Feature extraction; Learning systems; Image reconstruction; Generators; Adversarial learning; disentangled representation; face editing; prototype recovery; single sample per person; UNDERSAMPLED FACE RECOGNITION; TRAINING SAMPLE; REPRESENTATION; IMAGE;
DOI
10.1109/TNNLS.2021.3103194
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Single sample per person face recognition (SSPP FR) is one of the most challenging problems in FR due to the extreme lack of enrolment data. To date, the most popular SSPP FR methods are the generic learning methods, which recognize query face images based on the so-called prototype plus variation (i.e., P+V) model. However, the classic P+V model suffers from two major limitations: 1) it linearly combines the prototype and variation images in the observational pixel-spatial space and cannot generalize to multiple nonlinear variations, e.g., poses, which are common in face images and 2) it would be severely impaired once the enrolment face images are contaminated by nuisance variations. To address the two limitations, it is desirable to disentangle the prototype and variation in a latent feature space and to manipulate the images in a semantic manner. To this end, we propose a novel disentangled prototype plus variation model, dubbed DisP+V, which consists of an encoder-decoder generator and two discriminators. The generator and discriminators play two adversarial games such that the generator nonlinearly encodes the images into a latent semantic space, where the more discriminative prototype feature and the less discriminative variation feature are disentangled. Meanwhile, the prototype and variation features can guide the generator to generate an identity-preserved prototype and the corresponding variation, respectively. Experiments on various real-world face datasets demonstrate the superiority of our DisP+V model over the classic P+V model for SSPP FR. Furthermore, DisP+V demonstrates its unique characteristics in both prototype recovery and face editing/interpolation.
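For orientation, the classic P+V idea that the abstract contrasts against can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method and not DisP+V itself: a query image is modelled in pixel space as one enrolled prototype plus a linear combination of generic variation images, and the identity with the smallest reconstruction residual is returned. The function name `classify_pv`, the ridge (L2) regularizer used in place of sparse coding, the parameter `lam`, and the synthetic data are all hypothetical choices made for this sketch.

```python
import numpy as np


def classify_pv(query, prototypes, variations, lam=1e-3):
    """Classify a query with a linear prototype-plus-variation model.

    query:      (d,) flattened query image
    prototypes: (d, K) one enrolled prototype per identity (columns)
    variations: (d, M) generic variation images (columns)
    lam:        ridge regularization weight (assumption of this sketch)
    """
    # Joint dictionary: prototypes followed by variation images.
    D = np.hstack([prototypes, variations])
    # Ridge-regularized least squares for the coding coefficients.
    coeffs = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ query)
    n_ids = prototypes.shape[1]
    alpha, beta = coeffs[:n_ids], coeffs[n_ids:]
    # Part of the query explained by the shared variation images.
    explained_variation = variations @ beta
    # Class-wise residual: keep only identity k's prototype term.
    residuals = [
        np.linalg.norm(query - prototypes[:, k] * alpha[k] - explained_variation)
        for k in range(n_ids)
    ]
    return int(np.argmin(residuals))


# Synthetic example: identity 2 observed under one generic variation.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(50, 5))
variations = rng.normal(size=(50, 8))
query = prototypes[:, 2] + 0.5 * variations[:, 3]
predicted = classify_pv(query, prototypes, variations)
print(predicted)
```

Because the combination is strictly additive in pixel space, a sketch like this cannot represent nonlinear changes such as pose, and it assumes the enrolled prototypes are clean. Those are the two limitations the abstract attributes to the classic P+V model.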
Pages: 867-881
Page count: 15