Multimodal Emotion Classification With Multi-Level Semantic Reasoning Network

Cited by: 11
Authors
Zhu, Tong [1 ]
Li, Leida [2 ]
Yang, Jufeng [3 ]
Zhao, Sicheng [4 ]
Xiao, Xiao [5 ]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
[2] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
[3] Nankai Univ, Sch Comp & Control Engn, Tianjin 300350, Peoples R China
[4] Tsinghua Univ, BNRist, Beijing 100084, Peoples R China
[5] Xidian Univ, Sch Telecommun Engn, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Sentiment analysis; Visualization; Cognition; Feature extraction; Task analysis; Social networking (online); Multimodal emotion classification; Graph attention module; Semantic reasoning; SENTIMENT ANALYSIS;
DOI
10.1109/TMM.2022.3214989
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology];
Discipline code
0812 ;
Abstract
Nowadays, people are accustomed to posting images with associated text to express their emotions on social networks. Accordingly, multimodal sentiment analysis has drawn increasing attention. Most existing image-text multimodal sentiment analysis methods simply predict sentiment polarity. However, the same polarity may correspond to quite different emotions, such as happiness vs. excitement and disgust vs. sadness. Sentiment polarity is therefore ambiguous and may not convey the precise emotions people intend to express. Psychological research has shown that objects and words act as emotional stimuli and that semantic concepts can modulate the effect of these stimuli. Inspired by this observation, this paper presents a new MUlti-Level SEmantic Reasoning network (MULSER) for fine-grained image-text multimodal emotion classification, which investigates not only the semantic relationships among objects and among words, but also the semantic relationship between regional objects and global concepts. For the image modality, we first build graphs over extracted objects and a global representation, and employ a graph attention module to perform bilevel semantic reasoning. A joint visual graph is then built to learn regional-global semantic relations. For the text modality, we build a word graph and further apply graph attention to reinforce the interdependencies among words in a sentence. Finally, a cross-modal attention fusion module is proposed to fuse the semantic-enhanced visual and textual features, based on which informative multimodal representations are obtained for fine-grained emotion classification. Experimental results on public datasets demonstrate the superiority of the proposed model over state-of-the-art methods.
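The graph attention reasoning the abstract describes can be illustrated with a minimal, generic GAT-style layer: project node features (objects or words), score each edge with a shared attention vector, softmax over neighbors, and aggregate. This is a sketch of the general technique, not the authors' implementation; the function name, the LeakyReLU slope of 0.2, and the single-head formulation are assumptions.

```python
import numpy as np

def graph_attention(H, A, W, a):
    """One generic graph-attention pass over node features.

    H: (N, F) node features, A: (N, N) adjacency (nonzero = edge,
    self-loops included), W: (F, Fp) projection, a: (2*Fp,) attention vector.
    Returns (N, Fp) updated node features.
    """
    Z = H @ W                                   # project node features
    N = Z.shape[0]
    e = np.zeros((N, N))                        # pairwise attention logits
    for i in range(N):
        for j in range(N):
            s = a @ np.concatenate([Z[i], Z[j]])  # a^T [z_i || z_j]
            e[i, j] = s if s > 0 else 0.2 * s     # LeakyReLU, slope 0.2
    e = np.where(A > 0, e, -1e9)                # mask non-neighbors
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # row-wise softmax
    return alpha @ Z                            # attention-weighted aggregation
```

On a fully connected 3-node toy graph with identity features and an all-ones attention vector, every edge scores equally, so each output row is simply the mean of the projected neighbor features.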
Pages: 6868-6880
Page count: 13
Cited references
60 in total (entries [51]-[60] shown)
  • [51] Cross-modality Consistent Regression for Joint Visual-Textual Sentiment Analysis of Social Multimedia
    You, Quanzeng
    Luo, Jiebo
    Jin, Hailin
    Yang, Jianchao
    [J]. PROCEEDINGS OF THE NINTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'16), 2016, : 13 - 22
  • [52] A Multifaceted Approach to Social Multimedia-Based Prediction of Elections
    You, Quanzeng
    Cao, Liangliang
    Cong, Yang
    Zhang, Xianchao
    Luo, Jiebo
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (12) : 2271 - 2280
  • [53] Yu H, 2003, PROCEEDINGS OF THE 2003 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, P129
  • [54] Yuan J, 2013, Proceedings of the 2nd International Workshop on Issues of Sentiment Discovery and Opinion Mining, WISDOM '13, P1, DOI 10.1145/2502069.2502079
  • [55] MULTI-GRANULARITY REASONING FOR SOCIAL RELATION RECOGNITION FROM IMAGES
    Zhang, Meng
    Liu, Xinchen
    Liu, Wu
    Zhou, Anfu
    Ma, Huadong
    Mei, Tao
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1618 - 1623
  • [56] Emotion Recognition From Multiple Modalities: Fundamentals and methodologies
    Zhao, Sicheng
    Jia, Guoli
    Yang, Jufeng
    Ding, Guiguang
    Keutzer, Kurt
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 2021, 38 (06) : 59 - 73
  • [57] Affective Image Content Analysis: Two Decades Review and New Perspectives
    Zhao, Sicheng
    Yao, Xingxu
    Yang, Jufeng
    Jia, Guoli
    Ding, Guiguang
    Chua, Tat-Seng
    Schuller, Bjorn W.
    Keutzer, Kurt
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) : 6729 - 6751
  • [58] Exploring Principles-of-Art Features For Image Emotion Recognition
    Zhao, Sicheng
    Gao, Yue
    Jiang, Xiaolei
    Yao, Hongxun
    Chua, Tat-Seng
    Sun, Xiaoshuai
    [J]. PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 47 - 56
  • [59] Affective Image Retrieval via Multi-Graph Learning
    Zhao, Sicheng
    Yao, Hongxun
    Yang, You
    Zhang, Yanhao
    [J]. PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 1025 - 1028
  • [60] Zhu XG, 2017, PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P3595