A Combined Rule-Based & Machine Learning Audio-Visual Emotion Recognition Approach

Cited by: 52
Authors
Seng, Kah Phooi [1]
Ang, Li-Minn [1]
Ooi, Chien Shing [2]
Affiliations
[1] Charles Sturt Univ, Sch Comp & Math, Bathurst, NSW 2678, Australia
[2] Sunway Univ, Dept Comp Sci & Networked Syst, Subang Jaya 47500, Malaysia
Keywords
Emotion recognition; audio-visual processing; rule-based; machine learning; multimodal system; LINEAR DISCRIMINANT-ANALYSIS; EFFICIENT APPROACH; FACE; FRAMEWORK; FUSION; AUDIO; LDA;
DOI
10.1109/TAFFC.2016.2588488
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an audio-visual emotion recognition system that uses a mixture of rule-based and machine learning techniques to improve the recognition efficacy in the audio and video paths. The visual path is designed using Bi-directional Principal Component Analysis (BDPCA) and Least-Square Linear Discriminant Analysis (LSLDA) for dimensionality reduction and discrimination. The extracted visual features are passed into a newly designed Optimized Kernel-Laplacian Radial Basis Function (OKL-RBF) neural classifier. The audio path is designed using a combination of input prosodic features (pitch, log-energy, zero-crossing rate and Teager energy operator) and spectral features (Mel-scale frequency cepstral coefficients). The extracted audio features are passed into an audio feature-level fusion module that uses a set of rules to determine the most likely emotion contained in the audio signal. An audio-visual fusion module fuses the outputs from both paths. The performance of the proposed audio path, visual path, and final system is evaluated on standard databases. Experimental results and comparisons show that the proposed system performs well.
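Three of the prosodic features named in the abstract (log-energy, zero-crossing rate and the Teager energy operator) have standard textbook definitions, which the sketch below computes for a single audio frame. It is an illustration only, not the authors' implementation: the function names, the 8 kHz sample rate, the 400-sample frame and the 1 kHz test tone are all assumptions made for the example, and the paper's actual framing, pitch extraction and MFCC steps are not shown.

```python
import math


def log_energy(frame, eps=1e-12):
    """Natural log of the sum of squared samples; eps avoids log(0) on silence."""
    return math.log(sum(x * x for x in frame) + eps)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def teager_energy(frame):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [
        frame[n] ** 2 - frame[n - 1] * frame[n + 1]
        for n in range(1, len(frame) - 1)
    ]


def prosodic_features(frame):
    """Collect the three frame-level features into one dictionary."""
    teo = teager_energy(frame)
    return {
        "log_energy": log_energy(frame),
        "zcr": zero_crossing_rate(frame),
        "mean_teo": sum(teo) / len(teo),
    }


# Hypothetical test signal: a unit-amplitude 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
TONE_HZ = 1000.0
frame = [
    math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(400)
]
features = prosodic_features(frame)
print(features)
```

For a pure sinusoid of amplitude A and digital frequency Ω, the Teager energy operator is constant and equal to A² sin²(Ω), so the mean value for this test tone (Ω = π/4) should be close to 0.5. That makes a tone a convenient sanity check before applying the functions to real speech frames.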
Pages: 3-13
Number of pages: 11