Sparse landmarks for facial action unit detection using vision transformer and perceiver

Cited by: 0
Authors
Cakir, Duygu [1]
Yilmaz, Gorkem [2]
Arica, Nafiz [3]
Affiliations
[1] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Software Engn, Istanbul, Turkiye
[2] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Comp Engn, Istanbul, Turkiye
[3] Piri Reis Univ, Fac Engn, Dept Informat Syst Engn, Istanbul, Turkiye
Keywords
action unit detection; sparse learning; vision transformer; perceiver; recognition; patches
DOI
10.1504/IJCSE.2023.10060451
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The ability to accurately detect facial expressions, represented by facial action units (AUs), holds significant implications across diverse fields such as mental health diagnosis, security, and human-computer interaction. Although earlier approaches have made progress, the burgeoning complexity of facial actions demands more nuanced, computationally efficient techniques. This study pioneers the integration of sparse learning with vision transformer (ViT) and perceiver networks, focusing on the most active and descriptive landmarks for AU detection across both controlled (DISFA, BP4D) and in-the-wild (EmotioNet) datasets. Our novel approach, employing active landmark patches instead of the whole face, not only attains state-of-the-art performance but also uncovers insights into the differing attention mechanisms of ViT and perceiver. This fusion of techniques marks a significant advancement in facial analysis, potentially reshaping strategies in noise reduction and patch optimisation, setting a robust foundation for future research in the domain.
Pages: 607-620
Page count: 15