Exploiting enhanced and robust RGB-D face representation via progressive multi-modal learning

Cited by: 1
Authors
Zhu, Yizhe [1 ,2 ]
Gao, Jialin [1 ,2 ]
Wu, Tianshu [2 ]
Liu, Qiong [2 ]
Zhou, Xi [2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr, Shanghai 200240, Peoples R China
[2] CloudWalk Technol, Shanghai 201203, Peoples R China
Keywords
RGB-D face recognition; Multi-modal fusion; Depth enhancement; Multi-head-attention mechanism; Incomplete modal data; ATTENTION;
DOI
10.1016/j.patrec.2022.12.027
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing RGB-based 2D face recognition approaches are sensitive to facial variations, posture, occlusions, and illumination. Current depth-based methods have been proved to alleviate the above sensitivity by introducing geometric information but rely heavily on high-quality depth from high-cost RGB-D cameras. To this end, we propose a Progressive Multi-modal Fusion framework to exploit enhanced and robust face representation for RGB-D facial recognition based on low-cost RGB-D cameras, which also deals with incomplete RGB-D modal data. Due to the defects such as holes caused by low-cost cameras, we first design a depth enhancement module to refine the low-quality depth and correct depth inaccuracies. Then, we extract and aggregate augmented feature maps of RGB and depth modality step-by-step. Subsequently, the masked modeling scheme and iterative inter-modal feature interaction module aim to fully exploit the implicit relations among these two modalities. We perform comprehensive experiments to verify the superior performance and robustness of the proposed solution over other FR approaches on four challenging benchmark databases. (c) 2022 Elsevier B.V. All rights reserved.
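Illustrative note: the abstract mentions holes in depth maps from low-cost cameras. The sketch below is not the paper's depth enhancement module, whose design the abstract does not describe. It is a simple neighbour-averaging hole fill, shown only to make the idea of a "hole" concrete. The function name `fill_depth_holes`, the convention that a hole is stored as `0.0`, and the 3x3 window are all assumptions made for this example.

```python
from typing import List

Depth = List[List[float]]


def fill_depth_holes(depth: Depth, hole_value: float = 0.0) -> Depth:
    """Replace each hole pixel with the mean of its valid 3x3 neighbours.

    Assumption: holes are encoded as `hole_value`. Holes with no valid
    neighbour are left unchanged. The input is not modified.
    """
    rows, cols = len(depth), len(depth[0])
    filled = [row[:] for row in depth]  # copy so reads always use the original
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] != hole_value:
                continue  # valid measurement, keep as is
            neighbours = [
                depth[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if depth[rr][cc] != hole_value
            ]
            if neighbours:
                filled[r][c] = sum(neighbours) / len(neighbours)
    return filled


# Toy 3x3 depth map with a single hole in the centre.
depth_map = [
    [1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 2.0],
]
enhanced = fill_depth_holes(depth_map)
```

A learned enhancement module, as proposed in the paper, would replace this fixed averaging rule with a trained network; the sketch only shows the input and output shape of such a step.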
Pages: 38-45
Number of pages: 8
Related papers
26 in total
  • [1] Exploiting Multi-modal Fusion for Robust Face Representation Learning with Missing Modality
    Zhu, Yizhe
    Sun, Xin
    Zhou, Xi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255 : 283 - 294
  • [2] MULTI-MODAL TRANSFORMER FOR RGB-D SALIENT OBJECT DETECTION
    Song, Peipei
    Zhang, Jing
    Koniusz, Piotr
    Barnes, Nick
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2466 - 2470
  • [3] Cross-Level Multi-Modal Features Learning With Transformer for RGB-D Object Recognition
    Zhang, Ying
    Yin, Maoliang
    Wang, Heyong
    Hua, Changchun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (12) : 7121 - 7130
  • [4] MAPNet: Multi-modal attentive pooling network for RGB-D indoor scene classification
    Li, Yabei
    Zhang, Zhang
    Cheng, Yanhua
    Wang, Liang
    Tan, Tieniu
    PATTERN RECOGNITION, 2019, 90 : 436 - 449
  • [5] RGB-D Face Recognition via Deep Complementary and Common Feature Learning
    Zhang, Hao
    Han, Hu
    Cui, Jiyun
    Shan, Shiguang
    Chen, Xilin
    PROCEEDINGS 2018 13TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE & GESTURE RECOGNITION (FG 2018), 2018, : 8 - 15
  • [6] Multi-modal Representation Learning for Social Post Location Inference
    Dai, RuiTing
    Luo, Jiayi
    Luo, Xucheng
    Mo, Lisi
    Ma, Wanlun
    Zhou, Fan
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 6331 - 6336
  • [7] Tripartite interaction representation learning for multi-modal sentiment analysis
    Wang, Binqiang
    Dong, Gang
    Zhao, Yaqian
    Li, Rengang
    Yin, Wenfeng
    Lu, Lihua
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 268
  • [8] Discriminant Patch Representation for RGB-D Face Recognition using Convolutional Neural Networks
    Grati, Nesrine
    Ben-Hamadou, Achraf
    Hammami, Mohamed
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019, : 510 - 516
  • [9] Robust 3D Semantic Segmentation Method Based on Multi-Modal Collaborative Learning
    Ni, Peizhou
    Li, Xu
    Xu, Wang
    Zhou, Xiaojing
    Jiang, Tao
    Hu, Weiming
    REMOTE SENSING, 2024, 16 (03)
  • [10] Two-Level Attention-based Fusion Learning for RGB-D Face Recognition
    Uppal, Hardik
    Sepas-Moghaddam, Alireza
    Greenspan, Michael
    Etemad, Ali
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 10120 - 10127