Multimodal sentiment analysis based on fusion methods: A survey

Cited by: 138
Authors
Zhu, Linan [1 ]
Zhu, Zhechao [1 ]
Zhang, Chenwei [2 ]
Xu, Yifei [1 ]
Kong, Xiangjie [1 ]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Univ Hong Kong, Sch Fac Educ, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multimodal data; Sentiment analysis; Feature extraction; Fusion methods; NETWORK;
DOI
10.1016/j.inffus.2023.02.028
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sentiment analysis is an emerging technology that aims to explore people's attitudes toward an entity. It can be applied in a variety of fields and scenarios, such as product review analysis, public opinion analysis, psychological disease analysis, and risk assessment. Traditional sentiment analysis relies on the text modality alone and extracts sentiment information by inferring semantic relationships within sentences. However, some special expressions, such as irony and exaggeration, are difficult to detect from text alone. Multimodal sentiment analysis adds rich visual and acoustic information to the text and uses fusion analysis to more accurately infer the implied sentiment polarity (positive, neutral, or negative). The main challenge in multimodal sentiment analysis is the integration of cross-modal sentiment information, so we focus on introducing the framework and characteristics of different fusion methods. In addition, this article discusses the development status of multimodal sentiment analysis, popular datasets, feature extraction algorithms, application areas, and existing challenges. We hope that our work helps researchers understand the current state of research in multimodal sentiment analysis and that the insights provided here inspire them to develop effective models.
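As a concrete illustration of the fusion idea described in the abstract, the sketch below shows a minimal, hypothetical feature-level fusion model in PyTorch: utterance-level text, acoustic, and visual feature vectors are concatenated and mapped to the three sentiment polarities. This is not a method from the survey; the class name, feature dimensions, and layer sizes are illustrative assumptions only (the dimensions loosely follow common BERT/COVAREP/facial-feature setups).

```python
import torch
import torch.nn as nn

class ConcatFusionClassifier(nn.Module):
    """Feature-level (early) fusion sketch: concatenate per-modality
    features, then classify into positive / neutral / negative."""

    def __init__(self, text_dim=768, audio_dim=74, visual_dim=35,
                 hidden_dim=128, num_classes=3):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(text_dim + audio_dim + visual_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_feat, audio_feat, visual_feat):
        # Each input is an utterance-level feature vector of shape (batch, dim).
        fused = torch.cat([text_feat, audio_feat, visual_feat], dim=-1)
        return self.fusion(fused)  # logits over {positive, neutral, negative}

# Toy usage with random tensors standing in for real extractor outputs.
model = ConcatFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 74), torch.randn(4, 35))
print(logits.shape)  # torch.Size([4, 3])
```

The fusion methods surveyed in the article differ mainly in what replaces the simple concatenation step here, i.e., how cross-modal interactions are modeled before classification.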
Pages: 306-325
Page count: 20