Facial action unit (AU) detection provides precise measurements of facial appearance variations and holds great significance in affective computing, human-computer interaction, and negotiation. Subject-invariant AU detection remains challenging, primarily due to distribution variations among individuals. More importantly, the inherent subtlety and localized nature of facial actions frequently allow interference factors, particularly those related to individual identity, to dominate the learned representations. To tackle these issues, we propose a novel knowledge-driven hierarchical feature alignment (KHFA) framework that investigates the multifaceted consistency within facial action representations. AUs unambiguously define the facial appearance variations induced by specific groups of facial muscle movements, while the intrinsic physiological interconnections between these muscles impose substantial constraints on the correlations between different AUs. KHFA therefore introduces a dual classwise alignment scheme to balance consistency within the same class and coherence across different categories. Furthermore, the similarity of sample-level AU combinations reflects the semantic proximity of global features within the feature space; KHFA thus incorporates inter-sample relationships via a multilabel alignment scheme to enhance the coherence of semantic information across samples. Finally, a hybrid attention mechanism equipped with an importance-aware feature fusion layer is proposed to capture nuanced spatial features specific to individual AUs and to embed AU correlations. Extensive experiments on two benchmark datasets, BP4D and DISFA, show that KHFA outperforms state-of-the-art methods, underscoring the effectiveness and superiority of our approach.
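To make the multilabel alignment idea concrete, the following is a minimal sketch, not the paper's implementation: it assumes a PyTorch setting in which pairwise similarity of global features is encouraged to mirror pairwise similarity of the multi-hot AU label vectors. The function name, the use of cosine similarity on both sides, the MSE matching term, and the weighting factor `lambda_align` are illustrative assumptions, not details taken from KHFA.

```python
import torch
import torch.nn.functional as F

def multilabel_alignment_loss(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: align inter-sample feature similarity with
    sample-level AU label-combination similarity.

    features: (B, D) global feature vectors for a mini-batch.
    labels:   (B, C) multi-hot AU occurrence vectors (C = number of AUs).
    """
    # Pairwise cosine similarity between global features.
    f = F.normalize(features, dim=1)
    feat_sim = f @ f.t()                       # (B, B)

    # Pairwise cosine similarity between multi-hot AU label combinations.
    l = F.normalize(labels.float(), dim=1)     # all-zero rows stay zero
    label_sim = l @ l.t()                      # (B, B)

    # Encourage semantic proximity in feature space to reflect label similarity.
    return F.mse_loss(feat_sim, label_sim)

# Illustrative usage: added to the detection objective with a hypothetical weight.
# loss = detection_loss + lambda_align * multilabel_alignment_loss(global_feats, au_labels)
```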