Depth as Attention for Face Representation Learning

Cited by: 16

Authors:

Uppal, Hardik [1]

Sepas-Moghaddam, Alireza [1]

Greenspan, Michael [1]

Etemad, Ali [1]

Affiliation:

[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2021 / Vol. 16

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Face recognition; Feature extraction; Deep learning; Training; Lighting; Support vector machines; Resource description framework; RGB-D face recognition; depth-guided features; attention; multimodal deep network; VISUAL-ATTENTION; RECOGNITION; IDENTIFICATION; FEATURES; DATABASE;

DOI:

10.1109/TIFS.2021.3053458

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Face representation learning solutions have recently achieved great success in applications such as verification and identification. However, face recognition approaches based purely on RGB images rely solely on intensity information, and are therefore more sensitive to facial variations, notably pose and occlusions, and to environmental changes such as illumination and background. A novel depth-guided attention mechanism is proposed for deep multi-modal face recognition using low-cost RGB-D sensors. The attention mechanism directs the deep network "where to look" for visual features in the RGB image by focusing the network's attention using depth features extracted by a Convolutional Neural Network (CNN). The depth features help the network focus on regions of the face in the RGB image that contain more prominent person-specific information. The attention mechanism then uses this correlation to generate an attention map for RGB images from the depth features extracted by the CNN. We test our network on four public datasets, Lock3DFace, CurtinFaces, IIIT-D RGB-D, and KaspAROV, which include challenging variations in pose, occlusion, illumination, expression, and time lapse, and show that the features obtained by the proposed solution yield better results on all four. Our solution achieves average (increased) accuracies of 87.3% (+5.0%), 99.1% (+0.9%), 99.7% (+0.6%), and 95.3% (+0.5%) on the four datasets respectively, thereby improving on the state of the art. We also perform additional experiments with thermal images in place of depth images, showing that the solution generalizes well when another modality guides the attention mechanism.
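
The idea in the abstract, that features from one modality produce a spatial attention map that reweights RGB features, can be illustrated with a small sketch. This is not the authors' architecture: the function name `depth_guided_attention`, the channel-wise dot product used as the correlation score, the softmax normalization, and the toy feature maps are all assumptions chosen for illustration.

```python
import math


def depth_guided_attention(rgb_feat, depth_feat):
    """Reweight RGB features with a spatial attention map derived from depth.

    rgb_feat, depth_feat: nested lists of shape [C][H][W], standing in for
    CNN feature maps of matching shape (hypothetical inputs).
    Returns (attended_rgb, attention_map).
    """
    channels = len(rgb_feat)
    height = len(rgb_feat[0])
    width = len(rgb_feat[0][0])

    # Per-location score: channel-wise dot product of RGB and depth features
    # (an assumed stand-in for the correlation between the two modalities).
    scores = [
        [sum(rgb_feat[c][y][x] * depth_feat[c][y][x] for c in range(channels))
         for x in range(width)]
        for y in range(height)
    ]

    # Softmax over all spatial positions (max-subtracted for stability).
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    attention = [[e / total for e in row] for row in exps]

    # Scale every RGB channel by the shared spatial attention map.
    attended = [
        [[rgb_feat[c][y][x] * attention[y][x] for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]
    return attended, attention


# Toy 2-channel, 2x2 feature maps; depth responds strongly at position (0, 1).
rgb = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
depth = [[[0.0, 2.0], [0.0, 0.0]], [[0.0, 2.0], [0.0, 0.0]]]
attended, attention = depth_guided_attention(rgb, depth)
```

In this toy case the attention map sums to one and peaks at the location where the depth features respond, so the RGB features at that location dominate after reweighting. Passing thermal features as the second argument would follow the same pattern.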

Pages: 2461-2476

Page count: 16
