Learning spatial-temporal deformable networks for unconstrained face alignment and tracking in videos

Cited by: 12
Authors
Zhu, Hongyu [1 ]
Liu, Hao [1 ,2 ]
Zhu, Congcong [1 ,3 ]
Deng, Zongyong [1 ]
Sun, Xuehong [1 ,2 ]
Affiliations
[1] Ningxia Univ, Sch Informat Engn, Yinchuan 750021, Ningxia, Peoples R China
[2] Collaborat Innovat Ctr Ningxia Big Data & Artific, Yinchuan 750021, Ningxia, Peoples R China
[3] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Face alignment; Face tracking; Spatial transformer; Relational reasoning; Video analysis; Biometrics; IMAGE;
DOI
10.1016/j.patcog.2020.107354
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a spatial-temporal deformable networks approach to address both face alignment in static images and face tracking in videos under unconstrained environments. Unlike conventional feature extraction, which cannot explicitly exploit augmented spatial geometry for various facial shapes, our approach introduces a deformable hourglass networks (DHGN) method that learns a deformable mask to reduce the variance of facial deformation and to extract attentional facial regions for robust feature representation. However, DHGN extracts only spatial appearance features from static facial images and cannot explicitly exploit the temporal consistency information across consecutive frames in videos. For efficient temporal modeling, we further extend DHGN to a temporal DHGN (T-DHGN) paradigm designed specifically for video-based face alignment. To this end, T-DHGN incorporates a temporal relational reasoning module, so that the temporal order relationship among frames is encoded in the relational feature. In this way, T-DHGN reasons about temporal offsets to select a subset of discriminative frames over time steps, allowing memorized temporal consistency information to flow across frames for stable landmark tracking in videos. Compared with most state-of-the-art methods, our approach achieves superior performance on a range of widely evaluated benchmark datasets. Code will be made publicly available upon publication. (C) 2020 Elsevier Ltd. All rights reserved.
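The abstract's temporal relational reasoning idea can be illustrated with a minimal NumPy sketch: each frame is scored by aggregating pairwise relation features that combine two frames' features with their normalized temporal offset, and the top-scoring frames are selected. The relation weights, the tanh scoring function, and the top-k selection rule below are illustrative assumptions for exposition only, not the paper's actual T-DHGN architecture.

```python
import numpy as np

def temporal_relation_scores(frames):
    """Score each frame by aggregating toy pairwise relation features.

    frames: (T, D) array of per-frame feature vectors.
    The relation feature for a pair (i, j) concatenates both frames'
    features with the normalized temporal offset (j - i) / T, so the
    temporal order relationship is encoded in the relational feature.
    """
    T, D = frames.shape
    # Toy relation weights (uniform averaging); a learned network
    # would replace this in a real model.
    W = np.ones(2 * D + 1) / (2 * D + 1)
    scores = np.zeros(T)
    for i in range(T):
        for j in range(T):
            if i == j:
                continue
            rel = np.concatenate([frames[i], frames[j], [(j - i) / T]])
            scores[i] += np.tanh(rel @ W)
    return scores / (T - 1)

def select_frames(frames, k):
    """Pick the k most discriminative frames by relational score."""
    s = temporal_relation_scores(frames)
    return np.argsort(-s)[:k]
```

For example, if one frame's features stand out from otherwise uniform frames, its relation features dominate and it is ranked first; in a tracking pipeline the selected subset would then carry temporal consistency information across frames.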
Pages: 12