Sparse landmarks for facial action unit detection using vision transformer and perceiver

Citations: 0

Authors:

Cakir, Duygu [1]

Yilmaz, Gorkem [2]

Arica, Nafiz [3]

Affiliations:

[1] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Software Engn, Istanbul, Turkiye

[2] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Comp Engn, Istanbul, Turkiye

[3] Piri Reis Univ, Fac Engn, Dept Informat Syst Engn, Istanbul, Turkiye

Source:

INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING | 2024 / Vol. 27 / No. 05

Keywords:

action unit detection; sparse learning; vision transformer; perceiver; recognition; patches

DOI:

10.1504/IJCSE.2023.10060451

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

The ability to accurately detect facial expressions, represented by facial action units (AUs), holds significant implications across diverse fields such as mental health diagnosis, security, and human-computer interaction. Although earlier approaches have made progress, the burgeoning complexity of facial actions demands more nuanced, computationally efficient techniques. This study pioneers the integration of sparse learning with vision transformer (ViT) and perceiver networks, focusing on the most active and descriptive landmarks for AU detection across both controlled (DISFA, BP4D) and in-the-wild (EmotioNet) datasets. Our novel approach, employing active landmark patches instead of the whole face, not only attains state-of-the-art performance but also uncovers insights into the differing attention mechanisms of ViT and perceiver. This fusion of techniques marks a significant advancement in facial analysis, potentially reshaping strategies in noise reduction and patch optimisation, setting a robust foundation for future research in the domain.

Pages: 607 - 620

Page count: 15

50 items in total

[1] Enhanced facial action unit detection with adaptable patch sizes on representative landmarks
Duygu Cakir
Gorkem Yilmaz
Nafiz Arica
Neural Computing and Applications, 2025, 37 (5) : 3777 - 3791
[2] Cascading CNNs for facial action unit detection
Cakir, Duygu
Arica, Nafiz
ENGINEERING SCIENCE AND TECHNOLOGY-AN INTERNATIONAL JOURNAL-JESTECH, 2023, 47
[3] Facial Action Unit Detection Using Kernel Partial Least Squares
Gehrig, Tobias
Ekenel, Hazim Kemal
2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011
[4] Facial micro-expression recognition using three-stream vision transformer network with sparse sampling and relabeling
Zhang, He
Yin, Lu
Zhang, Hanling
Wu, Xuesong
SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (04) : 3761 - 3771
[5] Facial micro-expression recognition using three-stream vision transformer network with sparse sampling and relabeling
He Zhang
Lu Yin
Hanling Zhang
Xuesong Wu
Signal, Image and Video Processing, 2024, 18 : 3761 - 3771
[6] Stacking multiple cues for facial action unit detection
Akay, Simge
Arica, Nafiz
VISUAL COMPUTER, 2022, 38 (12) : 4235 - 4250
[7] Enhanced Facial Emotion Recognition Using Vision Transformer Models
Fatima, N. Sabiyath
Deepika, G.
Anthonisamy, Arun
Chitra, R. Jothi
Muralidharan, J.
Alagarsamy, Manjunathan
Ramyasree, Kummari
JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY, 2025, 20 (02) : 1143 - 1152
[8] Driver Drowsiness Detection Using Vision Transformer
Azmi, Muhammad Muizuddin Bin Mohamad
Zaman, Fadhlan Hafizhelmi Kamaru
2024 IEEE 14TH SYMPOSIUM ON COMPUTER APPLICATIONS & INDUSTRIAL ELECTRONICS, ISCAIE 2024, 2024, : 329 - 336
[9] Fall Event Detection using Vision Transformer
Dey, Ankita
Rajan, Sreeraman
Xiao, George
Lu, Jianping
2022 IEEE SENSORS, 2022
[10] Pupil Detection Using Hybrid Vision Transformer
Wang, Li
Wang, Changyuan
Zhang, Yu
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (12)