Unsupervised NIR-VIS Face Recognition via Homogeneous-to-Heterogeneous Learning and Residual-Invariant Enhancement
Cited by: 3
Authors:
Yang, Yiming [1]; Hu, Weipeng [2]; Hu, Haifeng [1]
Affiliations:
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou 510006, Peoples R China
[2] Nanyang Technol Univ, Sch Elect & Elect Engn EEE, Singapore 639798, Singapore
Funding:
National Natural Science Foundation of China;
Keywords:
Face recognition;
Feature extraction;
Task analysis;
Labeling;
Semantics;
Unsupervised learning;
Faces;
NIR-VIS face recognition;
unsupervised learning;
contrastive learning;
residual-invariant enhancement;
REPRESENTATION;
DOI:
10.1109/TIFS.2023.3346176
CLC Number:
TP301 [Theory and Methods];
Discipline Code:
081202
Abstract:
Near-Infrared and Visible light (NIR-VIS) face recognition methods have achieved remarkable success in security surveillance, criminal investigation, and multimedia information retrieval. However, existing methods rely heavily on carefully annotated labels, which makes manual labelling expensive and limits deployment flexibility. This motivates us to design unsupervised methods that address NIR-VIS recognition without relying on label information. To this end, we propose a novel homogeneous-to-HEterogeneous learning and Residual-invariant Enhancement (HERE) network for Unsupervised NIR-VIS Heterogeneous Face Recognition (NIR-VIS-UHFR). As the name suggests, the optimization of HERE follows a "homogeneous-to-heterogeneous learning" strategy to fully explore complementary and common semantic information across modalities. During the homogeneous learning phase, Modality-Adversarial Contrastive Learning (MACL) combines modality contrastive learning with adversarial learning. On the one hand, MACL learns compact and discriminative intra-modal representations for NIR and VIS data, respectively. On the other hand, MACL ensures that NIR-VIS data conform to a common feature distribution in a shared feature space, effectively reducing modality differences even in the absence of cross-modal identity information. In the heterogeneous learning phase, K-reciprocal-Encoding-based Cross-modal Labeling (KECL) is introduced as a robust pseudo-label estimator that fully explores cross-modal relationships and groups cross-modal features into clusters. With the pseudo labels provided by KECL, Refined cross-modal Contrastive Learning (RCL) is developed with modality-invariant averaging initialization and dynamic focus weighting strategies to extract modality-invariant features. Finally, Residual-invariant Representations Enhancement (RRE) mines partial features of cross-modal faces for robust matching. Compared to supervised methods, our unsupervised HERE achieves comparable performance on multiple datasets while offering greater scalability and practicality in deployment by reducing data acquisition requirements and costs.
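A minimal PyTorch sketch of the "contrastive plus adversarial" pattern the abstract attributes to MACL: intra-modal InfoNCE terms learn compact per-modality representations, while a gradient-reversal modality discriminator pushes NIR and VIS features toward a shared distribution without identity labels. This is an illustration under assumptions, not the authors' implementation; the encoder, discriminator, feature sizes, temperature, and the GradReverse helper are hypothetical stand-ins.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; negates the gradient in the backward
    # pass, so the encoder is trained to fool the modality discriminator.
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

def info_nce(a, b, temperature=0.07):
    # Symmetric InfoNCE loss: row i of `a` and row i of `b` are positives.
    a = F.normalize(a, dim=1)
    b = F.normalize(b, dim=1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Hypothetical shared encoder and modality discriminator.
encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
discriminator = nn.Linear(128, 2)  # 0 = NIR, 1 = VIS

# Stand-in backbone features for one unlabeled batch per modality,
# with crude noise-perturbed views standing in for augmentations.
nir, vis = torch.randn(32, 512), torch.randn(32, 512)
nir_aug = nir + 0.05 * torch.randn_like(nir)
vis_aug = vis + 0.05 * torch.randn_like(vis)

z_nir, z_nir_aug = encoder(nir), encoder(nir_aug)
z_vis, z_vis_aug = encoder(vis), encoder(vis_aug)

# Intra-modal contrastive terms: compact, discriminative features
# within each modality, no labels required.
loss_contrast = info_nce(z_nir, z_nir_aug) + info_nce(z_vis, z_vis_aug)

# Adversarial term: the discriminator learns to separate modalities while
# the reversed gradient drives the encoder toward a shared distribution.
z_all = GradReverse.apply(torch.cat([z_nir, z_vis], dim=0))
modality = torch.cat([torch.zeros(32), torch.ones(32)]).long()
loss_adv = F.cross_entropy(discriminator(z_all), modality)

(loss_contrast + loss_adv).backward()

The gradient-reversal trick lets a single backward pass train the discriminator and adversarially update the encoder at once, which is one standard way to reduce modality differences when no cross-modal identity correspondence is available.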
Pages: 2112-2126
Page count: 15