Depth as Attention for Face Representation Learning

Cited by: 16
Authors
Uppal, Hardik [1 ]
Sepas-Moghaddam, Alireza [1 ]
Greenspan, Michael [1 ]
Etemad, Ali [1 ]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Face recognition; Feature extraction; Deep learning; Training; Lighting; Support vector machines; Resource description framework; RGB-D face recognition; depth-guided features; attention; multimodal deep network; VISUAL-ATTENTION; RECOGNITION; IDENTIFICATION; FEATURES; DATABASE;
DOI
10.1109/TIFS.2021.3053458
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Face representation learning solutions have recently achieved great success for various applications such as verification and identification. However, face recognition approaches based purely on RGB images rely solely on intensity information and are therefore more sensitive to facial variations, notably pose and occlusions, and to environmental changes such as illumination and background. We propose a novel depth-guided attention mechanism for deep multi-modal face recognition using low-cost RGB-D sensors. Our attention mechanism directs the deep network "where to look" for visual features in the RGB image by focusing the network's attention using depth features extracted by a Convolutional Neural Network (CNN). The depth features help the network focus on regions of the face in the RGB image that contain more prominent person-specific information, and the attention mechanism uses this correlation to generate an attention map for the RGB image from the extracted depth features. We test our network on four public datasets, showing that the features obtained by our proposed solution yield better results on the Lock3DFace, CurtinFaces, IIIT-D RGB-D, and KaspAROV datasets, which include challenging variations in pose, occlusion, illumination, expression, and time lapse. Our solution achieves average (increased) accuracies of 87.3% (+5.0%), 99.1% (+0.9%), 99.7% (+0.6%), and 95.3% (+0.5%) on the four datasets respectively, thereby improving on the state of the art. We also perform additional experiments with thermal images instead of depth images, showing the high generalization ability of our solution when other modalities are adopted for guiding the attention mechanism.
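The general idea of depth-guided attention described in the abstract — deriving a spatial attention map from depth features and using it to reweight RGB features — can be illustrated with a minimal sketch. This is a hypothetical simplification using NumPy, not the authors' architecture: the pooling of depth channels, the softmax normalization, and all shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flat vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def depth_guided_attention(rgb_feat, depth_feat):
    """Reweight an RGB feature map (C, H, W) with a spatial attention
    map derived from a depth feature map (C', H, W).
    Illustrative sketch only; the paper's actual mechanism differs."""
    # Collapse depth channels into one saliency score per spatial location.
    scores = depth_feat.mean(axis=0)                          # (H, W)
    # Normalize scores into an attention map that sums to 1 over H*W.
    attn = softmax(scores.reshape(-1)).reshape(scores.shape)  # (H, W)
    # Broadcast the attention map across all RGB feature channels.
    return rgb_feat * attn[None, :, :]

rng = np.random.default_rng(0)
rgb = rng.standard_normal((64, 8, 8))    # assumed RGB feature map
depth = rng.standard_normal((32, 8, 8))  # assumed depth feature map
out = depth_guided_attention(rgb, depth)
print(out.shape)  # (64, 8, 8)
```

In the paper, both feature maps would come from CNN branches and the attention weights would be learned; here a simple channel-mean plus softmax stands in for that learned mapping.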
Pages: 2461-2476
Page count: 16
Related Papers
50 records
  • [21] Human-level face verification with intra-personal factor analysis and deep face representation
    Munasinghe, Sarasi
    Fookes, Clinton
    Sridharan, Sridha
    IET BIOMETRICS, 2018, 7 (05) : 467 - 473
  • [22] LFDA: A Framework for Light Field Depth Estimation With Depth Attention
    Kim, Hyeongsik
    Han, Seungjin
    Kim, Youngseop
    IEEE ACCESS, 2024, 12 : 65032 - 65040
  • [23] Face Recognition via Deep Learning and Constraint Sparse Representation
    Zhang J.-W.
    Niu S.-Z.
    Cao Z.-Y.
    Wang X.-Y.
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2019, 39 (03): : 255 - 261
  • [24] Face recognition: Sparse Representation vs. Deep Learning
    Alskeini, Neamah H.
    Kien Nguyen Thanh
    Chandran, Vinod
    Boles, Wageeh
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON GRAPHICS AND SIGNAL PROCESSING (ICGSP 2018), 2018, : 31 - 37
  • [25] Learning to pool high-level features for face representation
    Huang, Renjie
    Ye, Mao
    Xu, Pei
    Li, Tao
    Dou, Yumin
    VISUAL COMPUTER, 2015, 31 (12) : 1683 - 1695
  • [27] Joint learning for face alignment and face transfer with depth image
    Wang, Xiaoli
    Zheng, Yinglin
    Zeng, Ming
    Cheng, Xuan
    Lu, Wei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (45-46) : 33993 - 34010
  • [28] Lightweight Pig Face Feature Learning Evaluation and Application Based on Attention Mechanism and Two-Stage Transfer Learning
    Yin, Zhe
    Peng, Mingkang
    Guo, Zhaodong
    Zhao, Yue
    Li, Yaoyu
    Zhang, Wuping
    Li, Fuzhong
    Guo, Xiaohong
    AGRICULTURE-BASEL, 2024, 14 (01):
  • [29] Capsule Attention for Multimodal EEG-EOG Representation Learning With Application to Driver Vigilance Estimation
    Zhang, Guangyi
    Etemad, Ali
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2021, 29 : 1138 - 1149
  • [30] Attention Augmented Face Morph Detection
    Aghdaie, Poorya
    Soleymani, Sobhan
    Nasrabadi, Nasser M.
    Dawson, Jeremy
    IEEE ACCESS, 2023, 11 : 24281 - 24298