LD-MAN: Layout-Driven Multimodal Attention Network for Online News Sentiment Recognition

Cited by: 36
Authors
Guo, Wenya [1 ]
Zhang, Ying [1 ]
Cai, Xiangrui [2 ]
Meng, Lei [3 ]
Yang, Jufeng [1 ]
Yuan, Xiaojie [1 ]
Affiliations
[1] Nankai Univ, Tianjin Key Lab Network & Data Secur Technol, Coll Comp Sci, Tianjin 300350, Peoples R China
[2] Nankai Univ, Coll Cyber Sci, Tianjin 300350, Peoples R China
[3] Natl Univ Singapore, Sch Comp, NUS Tsinghua Southampton Ctr Extreme Search NExT, Singapore 117417, Singapore
Keywords
Sentiment analysis; Visualization; Layout; Feature extraction; Analytical models; Neural networks; Image recognition; Multimodal sentiment recognition; online news; attention mechanism; article layout; CLASSIFICATION; EXTRACTION; FUSION; MODEL
DOI
10.1109/TMM.2020.3003648
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
The prevailing use of both images and text to express opinions on the web leads to the need for multimodal sentiment recognition. Some commonly used social media data containing short text and few images, such as tweets and product reviews, have been well studied. However, it is still challenging to predict readers' sentiment after reading online news articles, since news articles often have more complicated structures, e.g., longer text and more images. To address this problem, we propose a layout-driven multimodal attention network (LD-MAN) to recognize news sentiment in an end-to-end manner. Rather than modeling text and images individually, LD-MAN uses the layout of online news to align images with the corresponding text. Specifically, it exploits a set of distance-based coefficients to model the image locations and measure the contextual relationship between images and text. LD-MAN then learns the affective representations of the articles from the aligned text and images using a multimodal attention mechanism. Considering the lack of relevant datasets in this field, we collect two multimodal online news datasets, containing a total of 14,566 articles with 56,260 images and 251,202 words. Experimental results demonstrate that the proposed method performs favorably compared with state-of-the-art approaches. We will release all the code, models, and datasets to the community.
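The layout-driven alignment described in the abstract, where distance-based coefficients weight each text segment by its proximity to an image in the article layout, can be illustrated with a minimal sketch. The Gaussian kernel, the concatenation-based fusion, and all function names below are assumptions for illustration only; the paper's actual coefficient formula and attention mechanism are not reproduced here.

```python
import numpy as np

def distance_coefficients(image_pos, para_positions, sigma=1.0):
    """Hypothetical distance-based coefficients: paragraphs closer to the
    image's position in the article layout receive larger weights.
    A Gaussian kernel is assumed purely for illustration."""
    d = np.abs(np.asarray(para_positions, dtype=float) - image_pos)
    w = np.exp(-(d ** 2) / (2 * sigma ** 2))
    return w / w.sum()  # normalize so the coefficients sum to 1

def align_image_with_text(image_feat, para_feats, image_pos, para_positions):
    """Aggregate paragraph features into a layout-aware text context for
    one image, then fuse by simple concatenation (illustrative stand-in
    for the paper's multimodal attention)."""
    coeffs = distance_coefficients(image_pos, para_positions)
    text_context = coeffs @ para_feats  # weighted sum over paragraphs
    return np.concatenate([image_feat, text_context])
```

Under this sketch, an image placed between the second and third paragraphs would draw its textual context mostly from those neighbors, which is the intuition behind using layout to align modalities instead of pooling all text uniformly.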
Pages: 1785-1798 (14 pages)
Related Papers (3)
  • [1] CLIP-driven attention network for multimodal sentiment analysis
    Lv, Jialun
    Yang, Qimeng
    Tian, Shengwei
    Liu, Bo
    Yu, Long
    THE JOURNAL OF SUPERCOMPUTING, 81 (8)
  • [2] Multimodal Cross-Attention Bayesian Network for Social News Emotion Recognition
    Wang, Xinzhi
    Li, Mengyue
    Chang, Yudong
    Luo, Xiangfeng
    Yao, Yige
    Li, Zhichao
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2023
  • [3] News-driven stock market index prediction based on trellis network and sentiment attention mechanism
    Liu, Wen-Jie
    Ge, Ye-Bo
    Gu, Yu-Chen
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 250