A Combined Rule-Based & Machine Learning Audio-Visual Emotion Recognition Approach

Cited by: 52

Authors:

Seng, Kah Phooi [1]

Ang, Li-Minn [1]

Ooi, Chien Shing [2]

Affiliations:

[1] Charles Sturt Univ, Sch Comp & Math, Bathurst, NSW 2678, Australia

[2] Sunway Univ, Dept Comp Sci & Networked Syst, Subang Jaya 47500, Malaysia

Source:

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING | 2018 / Vol. 9 / Issue 01

Keywords:

Emotion recognition; audio-visual processing; rule-based; machine learning; multimodal system; LINEAR DISCRIMINANT-ANALYSIS; EFFICIENT APPROACH; FACE; FRAMEWORK; FUSION; AUDIO; LDA;

DOI:

10.1109/TAFFC.2016.2588488

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes an audio-visual emotion recognition system that uses a mixture of rule-based and machine learning techniques to improve the recognition efficacy in the audio and video paths. The visual path is designed using Bi-directional Principal Component Analysis (BDPCA) and Least-Square Linear Discriminant Analysis (LSLDA) for dimensionality reduction and discrimination. The extracted visual features are passed into a newly designed Optimized Kernel-Laplacian Radial Basis Function (OKL-RBF) neural classifier. The audio path is designed using a combination of input prosodic features (pitch, log-energy, zero-crossing rate and Teager energy operator) and spectral features (Mel-scale frequency cepstral coefficients). The extracted audio features are passed into an audio feature-level fusion module that uses a set of rules to determine the most likely emotion contained in the audio signal. An audio-visual fusion module fuses the outputs from both paths. The performance of the proposed audio path, visual path, and final system is evaluated on standard databases. Experimental results and comparisons reveal the good performance of the proposed system.
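
As a rough illustration of three of the prosodic features named in the abstract (log-energy, zero-crossing rate, and the Teager energy operator), the following is a minimal NumPy sketch. It is not the authors' implementation: the function names, the 400-sample frame length, and the 160-sample hop are assumptions chosen for illustration, and pitch and MFCC extraction are left out.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_energy(frames, eps=1e-12):
    """Log of the summed squared samples in each frame (eps avoids log(0))."""
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def prosodic_features(x, frame_len=400, hop=160):
    """Stack per-frame log-energy, zero-crossing rate, and mean Teager energy.

    frame_len and hop are illustrative assumptions (25 ms and 10 ms at 16 kHz),
    not values taken from the paper.
    """
    frames = frame_signal(x, frame_len, hop)
    teo = np.array([np.mean(teager_energy(f)) for f in frames])
    return np.column_stack([log_energy(frames), zero_crossing_rate(frames), teo])
```

For a pure sinusoid of unit amplitude and digital frequency w (radians per sample), the Teager energy operator returns the constant sin(w)^2, which gives a quick sanity check for the sketch.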


Pages: 3 - 13

Number of pages: 11

50 records in total

[1] Audio-Visual Learning for Multimodal Emotion Recognition
Fan, Siyu
Jing, Jianan
Wang, Chongwen
SYMMETRY-BASEL, 2025, 17 (03):
[2] An Active Learning Paradigm for Online Audio-Visual Emotion Recognition
Kansizoglou, Ioannis
Bampis, Loukas
Gasteratos, Antonios
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (02) : 756 - 768
[3] Deep Learning Based Audio-Visual Emotion Recognition in a Smart Learning Environment
Ivleva, Natalja
Pentel, Avar
Dunajeva, Olga
Justsenko, Valeria
TOWARDS A HYBRID, FLEXIBLE AND SOCIALLY ENGAGED HIGHER EDUCATION, VOL 1, ICL 2023, 2024, 899 : 420 - 431
[4] Audio-visual spontaneous emotion recognition
Zeng, Zhihong
Hu, Yuxiao
Roisman, Glenn I.
Wen, Zhen
Fu, Yun
Huang, Thomas S.
ARTIFICIAL INTELLIGENCE FOR HUMAN COMPUTING, 2007, 4451 : 72 - +
[5] Deep emotion recognition based on audio-visual correlation
Hajarolasvadi, Noushin
Demirel, Hasan
IET COMPUTER VISION, 2020, 14 (07) : 517 - 527
[6] Emotion Recognition From Audio-Visual Data Using Rule Based Decision Level Fusion
Sahoo, Subhasmita
Routray, Aurobinda
PROCEEDINGS OF THE 2016 IEEE STUDENTS' TECHNOLOGY SYMPOSIUM (TECHSYM), 2016, : 7 - 12
[7] Learning Better Representations for Audio-Visual Emotion Recognition with Common Information
Ma, Fei
Zhang, Wei
Li, Yang
Huang, Shao-Lun
Zhang, Lin
APPLIED SCIENCES-BASEL, 2020, 10 (20) : 1 - 23
[8] Audio-Visual Emotion Recognition Based on Facial Expression and Affective Speech
Zhang, Shiqing
Li, Lemin
Zhao, Zhijin
MULTIMEDIA AND SIGNAL PROCESSING, 2012, 346 : 46 - +
[9] Noisy Speech Recognition Based on Combined Audio-Visual Classifiers
Terissi, Lucas D.
Sad, Gonzalo D.
Gomez, Juan C.
Parodi, Marianela
MULTIMODAL PATTERN RECOGNITION OF SOCIAL SIGNALS IN HUMAN-COMPUTER-INTERACTION, 2015, 8869 : 43 - 53
[10] Learning Affective Features With a Hybrid Deep Model for Audio-Visual Emotion Recognition
Zhang, Shiqing
Zhang, Shiliang
Huang, Tiejun
Gao, Wen
Tian, Qi
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (10) : 3030 - 3043
