Graph structure prefix injection transformer for multi-modal entity alignment

Cited by: 0
|
Authors
Zhang, Yan [1 ,2 ,3 ,4 ,5 ]
Luo, Xiangyu [2 ]
Hu, Jing [2 ]
Zhang, Miao [1 ,3 ,4 ]
Xiao, Kui [1 ,3 ,4 ]
Li, Zhifei [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Hubei Univ, Sch Comp Sci, Wuhan 430062, Peoples R China
[2] Hubei Univ, Sch Cyber Sci & Technol, Wuhan 430062, Peoples R China
[3] Hubei Univ, Hubei Key Lab Big Data Intelligent Anal & Applicat, Wuhan 430062, Peoples R China
[4] Hubei Univ, Key Lab Intelligent Sensing Syst & Secur, Minist Educ, Wuhan 430062, Peoples R China
[5] Hubei Univ, Hubei Prov Engn Res Ctr Intelligent Connected Vehi, Wuhan 430062, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-modal knowledge graphs; Multi-modal entity alignment; Contrastive learning; KNOWLEDGE GRAPHS;
DOI
10.1016/j.ipm.2024.104048
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Multi-modal entity alignment (MMEA) aims to integrate corresponding entities across different multi-modal knowledge graphs (MMKGs). However, previous studies have not adequately considered the impact of graph structural heterogeneity on entity alignment. Different MMKGs typically exhibit variations in graph structural features, so the same entity relationship can take distinct structural representations, and the graphs' topological structures also differ. To tackle these challenges, we introduce GSIEA, an MMEA framework that integrates structural prefix injection and modality fusion. Unlike methods that directly fuse structural data with multi-modal features to perform alignment, GSIEA processes structural data separately from multi-modal data such as images and attributes, incorporating a prefix-injection interaction module within a multi-head attention mechanism to make better use of multi-modal information and to minimize the impact of graph structural differences. In addition, GSIEA employs a convolutional enhancement module to extract fine-grained multi-modal features and computes cross-modal weights for feature fusion. Experiments on two public datasets, containing 12,846 and 11,199 entity pairs respectively, show that GSIEA outperforms baseline models, with an average improvement of 3.26% in MRR (maximum 12.5%) and an average improvement of 4.96% in Hits@1 (maximum 16.92%). The code of our model is available at https://github.com/HubuKG/GSIEA.
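The prefix-injection idea described in the abstract can be illustrated with a small sketch: structure-derived vectors are prepended to the keys and values of a multi-head attention layer, so the multi-modal queries attend to structural context without the two feature spaces being fused directly. The following is a minimal NumPy illustration under that assumption; all names, shapes, and the exact injection scheme are hypothetical and not taken from the GSIEA implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prefix_injected_attention(entity_feats, struct_prefix, num_heads=4):
    """One attention layer where graph-structure embeddings are
    prepended to the key/value sequence (prefix injection).

    entity_feats:  (n, d) multi-modal entity features (the queries)
    struct_prefix: (p, d) structure-derived prefix vectors
    Returns:       (n, d) entity features enriched with structural context
    """
    n, d = entity_feats.shape
    p = struct_prefix.shape[0]
    dh = d // num_heads
    # Keys/values see the structural prefix followed by the entity features,
    # so structure influences attention without being mixed into the queries.
    kv = np.concatenate([struct_prefix, entity_feats], axis=0)       # (p+n, d)
    q = entity_feats.reshape(n, num_heads, dh).transpose(1, 0, 2)    # (h, n, dh)
    k = kv.reshape(p + n, num_heads, dh).transpose(1, 0, 2)          # (h, p+n, dh)
    v = k
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)                  # (h, n, p+n)
    out = softmax(scores) @ v                                        # (h, n, dh)
    return out.transpose(1, 0, 2).reshape(n, d)

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))    # 6 entities, feature dim 8
prefix = rng.normal(size=(3, 8))   # 3 structural prefix slots
fused = prefix_injected_attention(feats, prefix, num_heads=2)
print(fused.shape)
```

In this sketch the prefix length p is a free hyperparameter; each query row attends over p + n key positions, which is the standard prefix-tuning pattern applied to structural embeddings.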
Pages: 17
Related Papers
50 records total
  • [21] Multi-modal entity alignment based on joint knowledge representation learning
    Wang, H.-Y.
    Lun, B.
    Zhang, X.-M.
    Sun, X.-L.
    Kongzhi yu Juece/Control and Decision, 2021, 35 (12): : 2855 - 2864
  • [22] Multi-Modal Entity Alignment Using Uncertainty Quantification for Modality Importance
    Hama, Kenta
    Matsubara, Takashi
    IEEE ACCESS, 2023, 11 : 28479 - 28489
  • [23] MFIEA: entity alignment through multi-modal feature interaction and knowledge facts
    Zhang, Xiaoming
    Lv, Menglong
    Wang, Huiyong
    Naseriparsa, Mehdi
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2025,
  • [24] Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment
    Chen, Zhuo
    Guo, Lingbing
    Fang, Yin
    Zhang, Yichi
    Chen, Jiaoyan
    Pan, Jeff Z.
    Li, Yangning
    Chen, Huajun
    Zhang, Wen
    SEMANTIC WEB, ISWC 2023, PART I, 2023, 14265 : 121 - 139
  • [25] Self-Supervised Entity Alignment Based on Multi-Modal Contrastive Learning
    Liu, Bo
    Song, Ruoyi
    Xiang, Yuejia
    Du, Junbo
    Ruan, Weijian
    Hu, Jinhui
    IEEE/CAA JOURNAL OF AUTOMATICA SINICA, 2022, 9 (11) : 2031 - 2033
  • [27] Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance
    Zhang, Dong
    Wei, Suzhong
    Li, Shoushan
    Wu, Hanqian
    Zhu, Qiaoming
    Zhou, Guodong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14347 - 14355
  • [28] Boosting Entity-Aware Image Captioning With Multi-Modal Knowledge Graph
    Zhao, Wentian
    Wu, Xinxiao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2659 - 2670
  • [29] Multi-modal Entity Alignment via Position-enhanced Multi-label Propagation
    Tang, Wei
    Wang, Yuanyi
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 366 - 375
  • [30] Hierarchical Multi-Modal Prompting Transformer for Multi-Modal Long Document Classification
    Liu, Tengfei
    Hu, Yongli
    Gao, Junbin
    Sun, Yanfeng
    Yin, Baocai
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 6376 - 6390