Depth as Attention for Face Representation Learning

Cited by: 16
Authors
Uppal, Hardik [1 ]
Sepas-Moghaddam, Alireza [1 ]
Greenspan, Michael [1 ]
Etemad, Ali [1 ]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Face recognition; Feature extraction; Deep learning; Training; Lighting; Support vector machines; Resource description framework; RGB-D face recognition; depth-guided features; attention; multimodal deep network; VISUAL-ATTENTION; RECOGNITION; IDENTIFICATION; FEATURES; DATABASE;
DOI
10.1109/TIFS.2021.3053458
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Face representation learning solutions have recently achieved great success for various applications such as verification and identification. However, face recognition approaches based purely on RGB images rely solely on intensity information and are therefore more sensitive to facial variations, notably pose and occlusions, and to environmental changes such as illumination and background. We propose a novel depth-guided attention mechanism for deep multi-modal face recognition using low-cost RGB-D sensors. Our attention mechanism directs the deep network "where to look" for visual features in the RGB image by focusing the network's attention using depth features extracted by a Convolutional Neural Network (CNN). The depth features help the network focus on regions of the face in the RGB image that contain more prominent person-specific information, and the attention mechanism uses this correlation to generate an attention map for the RGB image from the extracted depth features. We test our network on four public datasets, showing that the features obtained by our proposed solution yield better results on the Lock3DFace, CurtinFaces, IIIT-D RGB-D, and KaspAROV datasets, which include challenging variations in pose, occlusion, illumination, expression, and time lapse. Our solution achieves average (increased) accuracies of 87.3% (+5.0%), 99.1% (+0.9%), 99.7% (+0.6%), and 95.3% (+0.5%) on the four datasets respectively, thereby improving on the state of the art. We also perform additional experiments with thermal images instead of depth images, showing the high generalization ability of our solution when other modalities are adopted for guiding the attention mechanism.
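The general idea of depth-guided attention described in the abstract — deriving a spatial attention map from depth features and using it to reweight RGB features — can be illustrated with a minimal sketch. This is a hypothetical simplification using NumPy, not the authors' architecture: the pooling of depth channels, the softmax normalization, and all shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flat vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def depth_guided_attention(rgb_feat, depth_feat):
    """Reweight an RGB feature map (C, H, W) with a spatial attention
    map derived from a depth feature map (C', H, W).
    Illustrative sketch only; the paper's actual mechanism differs."""
    # Collapse depth channels into one saliency score per spatial location.
    scores = depth_feat.mean(axis=0)                          # (H, W)
    # Normalize scores into an attention map that sums to 1 over H*W.
    attn = softmax(scores.reshape(-1)).reshape(scores.shape)  # (H, W)
    # Broadcast the attention map across all RGB feature channels.
    return rgb_feat * attn[None, :, :]

rng = np.random.default_rng(0)
rgb = rng.standard_normal((64, 8, 8))    # assumed RGB feature map
depth = rng.standard_normal((32, 8, 8))  # assumed depth feature map
out = depth_guided_attention(rgb, depth)
print(out.shape)  # (64, 8, 8)
```

In the paper, both feature maps would come from CNN branches and the attention weights would be learned; here a simple channel-mean plus softmax stands in for that learned mapping.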
Pages: 2461-2476
Page count: 16
Related Papers
50 records
  • [21] Human-level face verification with intra-personal factor analysis and deep face representation
    Munasinghe, Sarasi
    Fookes, Clinton
    Sridharan, Sridha
    IET BIOMETRICS, 2018, 7 (05) : 467 - 473
  • [22] LFDA: A Framework for Light Field Depth Estimation With Depth Attention
    Kim, Hyeongsik
    Han, Seungjin
    Kim, Youngseop
    IEEE ACCESS, 2024, 12 : 65032 - 65040
  • [23] Face Recognition via Deep Learning and Constraint Sparse Representation
    Zhang J.-W.
    Niu S.-Z.
    Cao Z.-Y.
    Wang X.-Y.
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2019, 39 (03): : 255 - 261
  • [24] Face recognition: Sparse Representation vs. Deep Learning
    Alskeini, Neamah H.
    Kien Nguyen Thanh
    Chandran, Vinod
    Boles, Wageeh
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON GRAPHICS AND SIGNAL PROCESSING (ICGSP 2018), 2018, : 31 - 37
  • [25] Learning to pool high-level features for face representation
    Huang, Renjie
    Ye, Mao
    Xu, Pei
    Li, Tao
    Dou, Yumin
    VISUAL COMPUTER, 2015, 31 (12) : 1683 - 1695
  • [27] Joint learning for face alignment and face transfer with depth image
    Wang, Xiaoli
    Zheng, Yinglin
    Zeng, Ming
    Cheng, Xuan
    Lu, Wei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (45-46) : 33993 - 34010
  • [28] Lightweight Pig Face Feature Learning Evaluation and Application Based on Attention Mechanism and Two-Stage Transfer Learning
    Yin, Zhe
    Peng, Mingkang
    Guo, Zhaodong
    Zhao, Yue
    Li, Yaoyu
    Zhang, Wuping
    Li, Fuzhong
    Guo, Xiaohong
    AGRICULTURE-BASEL, 2024, 14 (01):
  • [29] Capsule Attention for Multimodal EEG-EOG Representation Learning With Application to Driver Vigilance Estimation
    Zhang, Guangyi
    Etemad, Ali
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2021, 29 : 1138 - 1149
  • [30] Attention Augmented Face Morph Detection
    Aghdaie, Poorya
    Soleymani, Sobhan
    Nasrabadi, Nasser M.
    Dawson, Jeremy
    IEEE ACCESS, 2023, 11 : 24281 - 24298