OAGknow: Self-Supervised Learning for Linking Knowledge Graphs

Cited by: 4
Authors
Liu, Xiao [1]
Mian, Li [2]
Dong, Yuxiao [3]
Zhang, Fanjin [1]
Zhang, Jing [4]
Tang, Jie [5]
Zhang, Peng [1]
Gong, Jibing [6]
Wang, Kuansan [3]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100190, Peoples R China
[2] Beijing Inst Technol, Beijing 100811, Peoples R China
[3] Microsoft Res, Redmond, WA 98052 USA
[4] Renmin Univ China, Beijing 100872, Peoples R China
[5] Tsinghua Univ, Tsinghua Bosch Joint ML Ctr, Dept Comp Sci & Technol, Beijing 100190, Peoples R China
[6] Yanshan Univ, Dept Informat Sci & Engn, Qinhuangdao 066104, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Encyclopedias; Internet; Electronic publishing; Knowledge based systems; Taxonomy; Training; Encoding; Concept linking; self-supervised learning; contrastive learning; knowledge base;
DOI
10.1109/TKDE.2021.3090830
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a self-supervised embedding learning framework, SelfLinKG, to link concepts in heterogeneous knowledge graphs. Without any labeled data, SelfLinKG achieves performance competitive with its supervised counterpart and outperforms state-of-the-art unsupervised methods by 26%-50% under the linear classification protocol. The essential components of SelfLinKG are local attention-based encoding and momentum contrastive learning. The former learns the graph representation using an attention network, while the latter learns a self-supervised model across knowledge graphs using contrastive learning. SelfLinKG has been deployed to build OAGknow, the new version of the Open Academic Graph (OAG). All data and code are publicly available.
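The momentum contrastive learning component named in the abstract can be illustrated with a generic sketch. The code below is not the SelfLinKG implementation; it only shows the two ingredients usually associated with momentum contrast, assuming a key encoder updated as a moving average of a query encoder and an InfoNCE-style loss over one positive and a set of negatives. All function names, the momentum value, and the temperature are hypothetical choices made for illustration.

```python
import math


def momentum_update(key_params, query_params, m=0.999):
    """Move key-encoder parameters toward the query encoder: k <- m*k + (1-m)*q.

    `m` is an illustrative value, not one taken from the paper.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query, one positive key, and a list of negative keys.

    `temperature` is an illustrative value, not one taken from the paper.
    """
    q = l2_normalize(query)
    # The positive logit sits at index 0, followed by the negative logits.
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp, then negative log-softmax of the positive.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum - logits[0]
```

In this sketch, a concept embedding from one knowledge graph would play the role of the query, its counterpart the positive, and other concepts the negatives; how SelfLinKG actually constructs these pairs is not described in this record.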
Pages: 1895-1908
Page count: 14