Multimodal Sentiment Analysis on Video Streams using Lightweight Deep Neural Networks

Authors
Yakaew, Atitaya [1]
Dailey, Matthew N. [1]
Racharak, Teeradaj [2]
Affiliations
[1] Asian Inst Technol, Dept Informat & Commun Technol, Klongluang, Pathumthani, Thailand
[2] Japan Adv Inst Sci & Technol, Sch Informat Sci, Nomi, Ishikawa, Japan
Keywords
Deep Learning for Multimodal Real-Time Analysis; Emotion Recognition; Video Processing and Analysis; Lightweight Deep Convolutional Neural Networks; Sentiment Classification
DOI
10.5220/0010304404420451
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-time sentiment analysis on video streams involves classifying a subject's emotional expressions over time based on visual and/or audio information in the data stream. Sentiment can be analyzed using various modalities such as speech, mouth motion, and facial expression. This paper proposes a deep learning approach based on multiple modalities in which features extracted from an audiovisual data stream are fused in real time for sentiment classification. The proposed system comprises four small deep neural network models that analyze visual and audio features concurrently. We fuse the visual and audio sentiment features into a single stream and accumulate evidence over time using an exponentially-weighted moving average to make a final prediction. Our work provides a promising solution to the problem of building real-time sentiment analysis systems under constrained software or hardware capabilities. Experiments on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) show that deep audiovisual feature fusion yields substantial improvements over analysis of either modality alone. We obtain an accuracy of 90.74%, compared with baseline accuracies of 11.11% to 31.48%, on a challenging test dataset.
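The evidence-accumulation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the smoothing factor or the exact fusion rule, so the value `alpha=0.3`, the function names `fuse_scores` and `ewma_predict`, and the use of simple averaging over per-frame class scores (rather than fusion of deep features) are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def fuse_scores(visual: Sequence[float], audio: Sequence[float]) -> List[float]:
    """Fuse two per-class score vectors by simple averaging (an assumed rule)."""
    if len(visual) != len(audio):
        raise ValueError("visual and audio score vectors must have equal length")
    return [(v + a) / 2.0 for v, a in zip(visual, audio)]


def ewma_predict(
    visual_stream: Sequence[Sequence[float]],
    audio_stream: Sequence[Sequence[float]],
    alpha: float = 0.3,  # assumed smoothing factor, not taken from the paper
) -> List[int]:
    """Return the predicted class index after each time step.

    The fused scores are smoothed with an exponentially-weighted moving
    average: state_t = alpha * fused_t + (1 - alpha) * state_{t-1}.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    state: Optional[List[float]] = None
    predictions: List[int] = []
    for visual, audio in zip(visual_stream, audio_stream):
        fused = fuse_scores(visual, audio)
        if state is None:
            state = fused  # initialise the average with the first observation
        else:
            state = [alpha * f + (1.0 - alpha) * s for f, s in zip(fused, state)]
        # Predict the class with the highest accumulated score so far.
        predictions.append(max(range(len(state)), key=state.__getitem__))
    return predictions


if __name__ == "__main__":
    # Hypothetical two-class scores; the third time step is a noisy outlier.
    visual = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.9, 0.1]]
    audio = [[0.7, 0.3], [0.8, 0.2], [0.4, 0.6], [0.7, 0.3]]
    print(ewma_predict(visual, audio, alpha=0.3))
```

With a small `alpha`, a single noisy time step is outweighed by the accumulated evidence, whereas `alpha=1.0` disables smoothing and lets each time step decide on its own.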
Pages: 442-451
Number of pages: 10