Capturing the Concept Projection in Metaphorical Memes for Downstream Learning Tasks

Cited by: 1

Authors:

Acharya, Sathwik [1]

Das, Bhaskarjyoti [2]

Sudarshan, T. S. B. [1]

Affiliations:

[1] PES Univ, Dept Comp Sci & Engn, Bengaluru 560085, Karnataka, India

[2] PES Univ, Dept Comp Sci & Engn AI & ML, Bengaluru 560085, Karnataka, India

Source:

IEEE Access | 2024 / Volume 12

Keywords:

Memes; metaphor; concept projection; cognitive computing; multimodal machine learning; knowledge graph; large language models; LANGUAGE

DOI:

10.1109/ACCESS.2023.3347988

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Metaphorical memes, where a source concept is projected into a target concept, are an essential construct in figurative language. In this article, we present a novel approach for downstream learning tasks on metaphorical multimodal memes. Our proposed framework replaces traditional methods using metaphor annotations with a metaphor-capturing mechanism. Besides using the significant zero-shot learning capability of state-of-the-art pretrained encoders, this work introduces an alternative external knowledge enhancement strategy based on ChatGPT (chatbot generative pretrained transformer), demonstrating its effectiveness in bridging the intermodal semantic gap. We propose a new concept projection process consisting of three distinct components to capture the intramodal knowledge and intermodal concept gap in the forms of text modality embedding, visual modality embedding, and concept projection embedding. This approach leverages the attention mechanism of the Graph Attention Network for fusing the common aspects of external knowledge related to the knowledge in the text and image modality to implement the concept projection process. Our experimental results demonstrate the superiority of our proposed approach compared to existing methods.


Pages: 1250-1265

Page count: 16

