Depth as Attention for Face Representation Learning

Cited by: 16
Authors
Uppal, Hardik [1]
Sepas-Moghaddam, Alireza [1]
Greenspan, Michael [1]
Etemad, Ali [1]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Face recognition; Feature extraction; Deep learning; Training; Lighting; Support vector machines; Resource description framework; RGB-D face recognition; depth-guided features; attention; multimodal deep network; VISUAL-ATTENTION; RECOGNITION; IDENTIFICATION; FEATURES; DATABASE;
DOI
10.1109/TIFS.2021.3053458
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Face representation learning solutions have recently achieved great success in applications such as verification and identification. However, face recognition approaches based purely on RGB images rely solely on intensity information and are therefore more sensitive to facial variations, notably pose, occlusions, and environmental changes such as illumination and background. A novel depth-guided attention mechanism is proposed for deep multi-modal face recognition using low-cost RGB-D sensors. Our attention mechanism directs the deep network "where to look" for visual features in the RGB image by focusing the attention of the network using depth features extracted by a Convolutional Neural Network (CNN). The depth features help the network focus on regions of the face in the RGB image that contain more prominent person-specific information. Our attention mechanism then uses this correlation to generate an attention map for RGB images from the depth features extracted by the CNN. We test our network on four public datasets, showing that the features obtained by our proposed solution yield better results on the Lock3DFace, CurtinFaces, IIIT-D RGB-D, and KaspAROV datasets, which include challenging variations in pose, occlusion, illumination, expression, and time lapse. Our solution achieves average (increased) accuracies of 87.3% (+5.0%), 99.1% (+0.9%), 99.7% (+0.6%), and 95.3% (+0.5%) for the four datasets, respectively, thereby improving the state of the art. We also perform additional experiments with thermal images instead of depth images, showing the high generalization ability of our solution when other modalities are adopted to guide the attention mechanism.
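The idea of using depth features to produce an attention map over RGB features can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not specify how the map is computed, so the channel-wise mean followed by a spatial softmax is an assumption, and the function names `depth_attention_map` and `apply_attention` are hypothetical. Feature maps are plain nested lists of shape channels x height x width.

```python
import math


def depth_attention_map(depth_feats):
    """Collapse depth features (C x H x W) into an H x W attention map.

    Assumption: channel-wise mean followed by a spatial softmax. The
    mechanism used in the paper may differ.
    """
    channels = len(depth_feats)
    height, width = len(depth_feats[0]), len(depth_feats[0][0])
    # Average over channels: one saliency score per spatial location.
    scores = [[sum(depth_feats[c][i][j] for c in range(channels)) / channels
               for j in range(width)] for i in range(height)]
    # Spatial softmax, with the maximum subtracted for numerical stability.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def apply_attention(rgb_feats, attention):
    """Reweight every RGB feature channel by the shared spatial map."""
    return [[[value * attention[i][j] for j, value in enumerate(row)]
             for i, row in enumerate(channel)] for channel in rgb_feats]


# Toy input: one depth channel and two RGB channels on a 2 x 2 grid.
depth = [[[0.0, 2.0], [0.0, 0.0]]]
rgb = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
attn = depth_attention_map(depth)
attended = apply_attention(rgb, attn)
```

In this toy input the depth response is largest at the top-right location, so that location receives the largest weight and the RGB features there are suppressed least.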
Pages: 2461-2476
Page count: 16