Sparse landmarks for facial action unit detection using vision transformer and perceiver

Authors
Cakir, Duygu [1]
Yilmaz, Gorkem [2]
Arica, Nafiz [3]
Affiliations
[1] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Software Engn, Istanbul, Turkiye
[2] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Comp Engn, Istanbul, Turkiye
[3] Piri Reis Univ, Fac Engn, Dept Informat Syst Engn, Istanbul, Turkiye
Keywords
action unit detection; sparse learning; vision transformer; perceiver; recognition; patches
DOI
10.1504/IJCSE.2023.10060451
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The ability to accurately detect facial expressions, represented by facial action units (AUs), holds significant implications across diverse fields such as mental health diagnosis, security, and human-computer interaction. Although earlier approaches have made progress, the burgeoning complexity of facial actions demands more nuanced, computationally efficient techniques. This study pioneers the integration of sparse learning with vision transformer (ViT) and perceiver networks, focusing on the most active and descriptive landmarks for AU detection across both controlled (DISFA, BP4D) and in-the-wild (EmotioNet) datasets. Our novel approach, employing active landmark patches instead of the whole face, not only attains state-of-the-art performance but also uncovers insights into the differing attention mechanisms of ViT and perceiver. This fusion of techniques marks a significant advancement in facial analysis, potentially reshaping strategies in noise reduction and patch optimisation, setting a robust foundation for future research in the domain.
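As a rough illustration of what "landmark patches instead of the whole face" could look like in practice, the sketch below crops a fixed-size square around each landmark and flattens it into one token per landmark, the kind of input a ViT or perceiver would consume. This is not the authors' implementation: the function name `extract_landmark_patches`, the patch size of 16, and the reflect padding are assumptions made for illustration, and the sparse-learning step that selects the most active landmarks is not shown. Landmarks are assumed to be (x, y) pixel coordinates inside the image.

```python
import numpy as np

def extract_landmark_patches(image, landmarks, patch_size=16):
    """Crop a square patch around each (x, y) landmark and flatten it into a token.

    Hypothetical helper for illustration; image is an (H, W, C) array and
    landmarks are assumed to lie inside the image bounds.
    """
    half = patch_size // 2
    # Reflect-pad so patches near the image border keep their full size.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    tokens = []
    for x, y in landmarks:
        # Original pixel (y, x) sits at (y + half, x + half) after padding,
        # so the window starting at (y, x) in padded coordinates is centred on it.
        row, col = int(round(y)), int(round(x))
        patch = padded[row:row + patch_size, col:col + patch_size, :]
        tokens.append(patch.reshape(-1))
    # One row per landmark: shape (num_landmarks, patch_size * patch_size * C).
    return np.stack(tokens)

image = np.zeros((128, 128, 3), dtype=np.float32)
landmarks = np.array([[0.0, 0.0], [64.0, 40.0], [127.0, 127.0]])
tokens = extract_landmark_patches(image, landmarks)
print(tokens.shape)  # (3, 768)
```

With only a handful of landmark tokens rather than a full grid of face patches, the sequence fed to the attention layers is much shorter, which is one plausible reading of the efficiency argument in the abstract.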
Pages: 607-620
Number of pages: 15