Research on Multimodal Sentiment Classification of Internet Memes Based on Transformer

Citations: 0
Authors
Chi, Shengnan [1]
Sang, Guoming [1]
Shi, Xian [1]
Affiliations
[1] Dalian Maritime Univ, Dalian 116000, Liaoning, Peoples R China
Source
PROCEEDINGS OF 2024 3RD INTERNATIONAL CONFERENCE ON CRYPTOGRAPHY, NETWORK SECURITY AND COMMUNICATION TECHNOLOGY, CNSCT 2024 | 2024
Keywords
Transformer; Sentiment Analysis; Multi-modal; Internet Memes
DOI
10.1145/3673277.3673354
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In the past few years, internet memes have emerged as one of the most widely shared forms of content on social media platforms. People use memes to express their emotional states, whether by sharing opinions, conveying viewpoints, or showing attitudes. However, traditional methods for sentiment analysis of memes feed image and text features directly into fully connected layers followed by a classification layer with softmax activation. Because the extracted image features are simply joined with the text features, this approach overlooks the global context and semantic information in images, which degrades sentiment analysis performance. To address these issues, this paper proposes a multimodal sentiment analysis framework named BERES. The framework uses CRNN+CTC to extract the text embedded in memes, then uses the BERT language model and ResNet50 to learn the text and visual features of meme images. To enhance the model's representation capability for the input data, a Transformer-based visual enhancement module is introduced. The text features and image sequence features are then concatenated and fed into a fusion layer consisting of six Transformer encoder layers to achieve a deeper fusion of text and image features. Extensive experiments on publicly available datasets demonstrate that the proposed model outperforms existing multimodal models.
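The fusion step described in the abstract (concatenate the text and image token sequences, then pass them through six stacked encoder layers) can be illustrated with a toy sketch in pure Python. This is not the authors' BERES implementation: the small hand-written vectors stand in for BERT and ResNet50 outputs, each layer is a simplified single-head self-attention with identity projections and no feed-forward sublayer, and all function names (`fuse`, `self_attention_layer`, `mean_pool`) are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(vec, eps=1e-5):
    # Normalize one token vector to zero mean and (near) unit variance.
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def self_attention_layer(tokens):
    # Simplified encoder layer (assumption): single head, identity Q/K/V
    # projections, residual connection, layer norm, no feed-forward sublayer.
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * tok[j] for w, tok in zip(weights, tokens))
            for j in range(dim)
        ]
        out.append(layer_norm([q + m for q, m in zip(query, mixed)]))
    return out


def fuse(text_tokens, image_tokens, num_layers=6):
    # Concatenate along the sequence axis so every text token can attend
    # to every image token (and vice versa), then stack the layers.
    sequence = text_tokens + image_tokens
    for _ in range(num_layers):
        sequence = self_attention_layer(sequence)
    return sequence


def mean_pool(tokens):
    # Average over the sequence to get one fused vector for a classifier.
    dim = len(tokens[0])
    return [sum(tok[j] for tok in tokens) / len(tokens) for j in range(dim)]


# Toy stand-ins for encoder outputs: 2 text tokens, 3 image tokens, width 4.
text_tokens = [[0.2, -0.1, 0.4, 0.0], [0.5, 0.3, -0.2, 0.1]]
image_tokens = [[-0.3, 0.8, 0.1, 0.2], [0.0, 0.1, 0.9, -0.4], [0.6, -0.5, 0.2, 0.3]]

fused = fuse(text_tokens, image_tokens)
pooled = mean_pool(fused)
print(len(fused), len(pooled))  # fused sequence length and feature width
```

The point of the sketch is that, after concatenation, attention operates over both modalities at once, in contrast to the baseline the abstract criticizes, where pooled image and text features are joined only at the fully connected layers. A real model would use learned projections, multiple heads, and a softmax classification head on the pooled vector.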
Pages: 445-450
Number of pages: 6