Spatial-Temporal Knowledge Integration: Robust Self-Supervised Facial Landmark Tracking

Cited by: 5
Authors
Zhu, Congcong [1]
Li, Xiaoqiang [1, 2]
Li, Jide [1]
Ding, Guangtai [1]
Tong, Weiqin [1, 2]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
[2] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Shanghai, Peoples R China
Source
MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA | 2020
Keywords
Face Tracking; Self-supervised Learning; Knowledge Distillation; Face Alignment
DOI
10.1145/3394171.3413993
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The diversity of training data significantly affects a model's tracking robustness in unconstrained environments. However, existing labeled datasets for facial landmark tracking tend to be large but not diverse, and manually annotating massive numbers of clips from new, diverse videos is extremely expensive. To address these problems, we propose a Spatial-Temporal Knowledge Integration (STKI) approach. Unlike most existing methods, which rely heavily on labeled data, STKI exploits supervision from unlabeled data. Specifically, STKI integrates spatial-temporal knowledge from massive unlabeled videos, which are several orders of magnitude more diverse than existing labeled video data, for robust tracking. Our framework includes a self-supervised tracker and an image-based detector for tracking initialization. To avoid distortion of the facial shape, the tracker leverages adversarial learning to introduce a facial structure prior and temporal knowledge into cycle-consistency tracking. Meanwhile, we design a graph-based knowledge distillation method, which distills knowledge from tracking and detection results, to improve the generalization of the detector. The fine-tuned detector can provide the tracker with high-quality initialization on unconstrained videos. Extensive experimental results show that the proposed method achieves state-of-the-art performance on comprehensive evaluation datasets.
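The abstract does not state the tracker's loss, so the following is only a minimal sketch of the general forward-backward cycle-consistency idea it refers to, not the STKI implementation. The names `cycle_consistency_error` and `make_toy_step`, and the toy tracker with a constant per-step drift, are assumptions introduced purely for illustration.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Shape = List[Point]  # one (x, y) pair per facial landmark


def cycle_consistency_error(
    start: Shape,
    num_frames: int,
    step: Callable[[Shape, int, int], Shape],
) -> float:
    """Track landmarks forward from frame 0 to the last frame, then backward
    to frame 0, and return the mean distance between the starting landmarks
    and the landmarks recovered at the end of the cycle."""
    shape = start
    for t in range(num_frames - 1):          # forward pass: t -> t + 1
        shape = step(shape, t, t + 1)
    for t in range(num_frames - 1, 0, -1):   # backward pass: t -> t - 1
        shape = step(shape, t, t - 1)
    dists = [math.dist(p, q) for p, q in zip(start, shape)]
    return sum(dists) / len(dists)


def make_toy_step(drift: float) -> Callable[[Shape, int, int], Shape]:
    """Hypothetical tracker: the face moves 2 px right per frame, and the
    tracker adds a constant horizontal drift on every step it takes."""
    def step(shape: Shape, src: int, dst: int) -> Shape:
        dx = 2.0 * (dst - src) + drift
        return [(x + dx, y) for x, y in shape]
    return step


landmarks = [(10.0, 20.0), (30.0, 20.0), (20.0, 35.0)]
perfect = cycle_consistency_error(landmarks, 4, make_toy_step(0.0))
drifting = cycle_consistency_error(landmarks, 4, make_toy_step(0.5))
print(perfect, drifting)
```

A drift-free tracker returns exactly to its starting landmarks, while a drifting one accumulates error over the six steps of the cycle. That error needs no ground-truth labels to compute, which is why a quantity of this kind can serve as a training signal on unlabeled video.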
Pages: 4135-4143
Page count: 9