Contrastive Multi-Modal Knowledge Graph Representation Learning

Cited by: 12
Authors
Fang, Quan [1 ]
Zhang, Xiaowei [2 ]
Hu, Jun [1 ]
Wu, Xian [3 ]
Xu, Changsheng [1 ,4 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[2] Zhengzhou Univ, Zhengzhou 450001, Peoples R China
[3] Tencent Med AI Lab, Beijing 100080, Peoples R China
[4] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Knowledge graph; multimedia; graph neural network; contrastive learning; NETWORK;
DOI
10.1109/TKDE.2022.3220625
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Representation learning of knowledge graphs (KGs) aims to embed both entities and relations as vectors in a continuous low-dimensional space, which has facilitated various applications such as link prediction and entity retrieval. Most existing KG embedding methods focus on modeling the structured fact triples independently and ignore the multi-type relations among triples as well as the variety of data types (e.g., texts and images) associated with entities in KGs, and thus fail to capture the complex and multi-modal information inherent in the entity-relation triples. In this paper, we propose a novel approach for knowledge graph embedding named Contrastive Multi-modal Graph Neural Network (CMGNN), which can encapsulate comprehensive features from multi-modal content descriptions of entities and high-order connectivity structures. Specifically, CMGNN first learns entity embeddings from multi-modal content and then contrasts encodings from multi-relational local neighbors and high-order connectivities to obtain latent representations of entities and relations simultaneously. Experimental results demonstrate that CMGNN can effectively model the multi-modalities and multi-type structures in KGs, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the tasks of link prediction and entity classification.
Pages: 8983-8996
Page count: 14