A Review of Key Technologies for Emotion Analysis Using Multimodal Information

Cited by: 16
Authors
Zhu, Xianxun [1 ]
Guo, Chaopeng [1 ]
Feng, Heyang [1 ]
Huang, Yao [1 ]
Feng, Yichen [1 ]
Wang, Xiangyang [1 ]
Wang, Rui [1 ]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, 99 Shangda Rd, Shanghai 200444, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multimodal information; Emotional analysis; Multimodal fusion; ACUTE CORONARY SYNDROMES; SENTIMENT; FUSION; RECOGNITION; EXPRESSIONS; ATTENTION; STRENGTH; TRIGGERS; DATABASE; MODEL;
DOI
10.1007/s12559-024-10287-z
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion analysis, an integral aspect of human-machine interaction, has witnessed significant advancements in recent years. With the rise of multimodal data sources such as speech, text, and images, there is a profound need for a comprehensive review of the pivotal elements within this domain. Our paper delves deep into the realm of emotion analysis, examining multimodal data sources encompassing speech, text, images, and physiological signals. We provide a curated overview of relevant literature, academic forums, and competitions. Emphasis is placed on dissecting unimodal processing methods, including preprocessing, feature extraction, and tools across speech, text, images, and physiological signals. We further discuss the nuances of multimodal data fusion techniques, spotlighting early, late, model-level, and hybrid fusion strategies. Key findings indicate the essentiality of analyzing emotions across multiple modalities. Detailed discussions on emotion elicitation, expression, and representation models are presented. Moreover, we uncover challenges such as dataset creation, modality synchronization, model efficiency, limited data scenarios, cross-domain applicability, and the handling of missing modalities. Practical solutions and suggestions are provided to address these challenges. The realm of multimodal emotion analysis is vast, with numerous applications ranging from driver sentiment detection to medical evaluations. Our comprehensive review serves as a valuable resource for both scholars and industry professionals. It not only sheds light on the current state of research but also highlights potential directions for future innovations. The insights garnered from this paper are expected to pave the way for subsequent advancements in deep multimodal emotion analysis tailored for real-world deployments.
Pages: 1504-1530 (27 pages)
Related Papers
(50 records in total)
[41]   A review on sentiment analysis and emotion detection from text [J].
Nandwani, Pansy ;
Verma, Rupali .
SOCIAL NETWORK ANALYSIS AND MINING, 2021, 11 (01)
[42]   A Review of Multimodal Interaction in Remote Education: Technologies, Applications, and Challenges [J].
Xie, Yangmei ;
Yang, Liuyi ;
Zhang, Miao ;
Chen, Sinan ;
Li, Jialong .
APPLIED SCIENCES-BASEL, 2025, 15 (07)
[43]   A systematic survey on multimodal emotion recognition using learning algorithms [J].
Ahmed, Naveed ;
Al Aghbari, Zaher ;
Girija, Shini .
INTELLIGENT SYSTEMS WITH APPLICATIONS, 2023, 17
[44]   Multimodal fusion techniques: Review, data representation, information fusion, and application areas [J].
Hangloo, Sakshini ;
Arora, Bhavna .
NEUROCOMPUTING, 2025, 649
[45]   Audio-Guided Fusion Techniques for Multimodal Emotion Analysis [J].
Shi, Pujin ;
Gao, Fei .
PROCEEDINGS OF THE 2ND INTERNATIONAL WORKSHOP ON MULTIMODAL AND RESPONSIBLE AFFECTIVE COMPUTING, MRAC 2024, 2024, :62-66
[46]   Multimodal Emotion Analysis Based on Visual, Acoustic and Linguistic Features [J].
Koren, Leon ;
Stipancic, Tomislav ;
Ricko, Andrija ;
Orsag, Luka .
SOCIAL COMPUTING AND SOCIAL MEDIA: DESIGN, USER EXPERIENCE AND IMPACT, SCSM 2022, PT I, 2022, 13315 :318-331
[47]   Quantum neural networks for multimodal sentiment, emotion, and sarcasm analysis [J].
Singh, Jaiteg ;
Bhangu, Kamalpreet Singh ;
Alkhanifer, Abdulrhman ;
Alzubi, Ahmad Ali ;
Ali, Farman .
ALEXANDRIA ENGINEERING JOURNAL, 2025, 124 :170-187
[48]   UniC: a dataset for emotion analysis of videos with multimodal and unimodal labels [J].
Du, Quanqi ;
Labat, Sofie ;
Demeester, Thomas ;
Hoste, Veronique .
LANGUAGE RESOURCES AND EVALUATION, 2025, 59 (03) :2857-2892
[49]   AttendAffectNet-Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-Attention [J].
Thao, Ha Thi Phuong ;
Balamurali, B. T. ;
Roig, Gemma ;
Herremans, Dorien .
SENSORS, 2021, 21 (24)
[50]   Multimodal Attention Network for Continuous-Time Emotion Recognition Using Video and EEG Signals [J].
Choi, Dong Yoon ;
Kim, Deok-Hwan ;
Song, Byung Cheol .
IEEE ACCESS, 2020, 8 :203814-203826