XLORE 3: A Large-Scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge Resources

Cited by: 0
Authors
Zeng, Kaisheng [1 ]
Jin, Hailong [1 ]
Lv, Xin [1 ]
Zhu, Fangwei [2]
Hou, Lei [1 ]
Zhang, Yi [1 ]
Pang, Fan [1 ]
Qi, Yu [1 ]
Liu, Dingxiao [1 ]
Li, Juanzi [1 ]
Feng, Ling [1 ]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Peking Univ, Beijing, Peoples R China
Keywords
Knowledge graph; knowledge management; knowledge fusion; knowledge completion; schema construction; entity typing; entity alignment; entity linking; ENTITY; CONSTRUCTION
DOI
10.1145/3660521
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, knowledge graphs (KGs) have attracted significant attention from academia and industry, resulting in the development of numerous technologies for KG construction, completion, and application. XLORE is one of the largest multilingual KGs, built from Baidu Baike and Wikipedia via a series of knowledge modeling and acquisition methods. In this article, we apply systematic methods to improve XLORE's data quality and present its latest version, XLORE 3, which enables the effective integration and management of heterogeneous knowledge from diverse resources. Compared with previous versions, XLORE 3 has three major advantages: (1) We design a comprehensive and reasonable schema, namely the XLORE ontology, which can effectively organize and manage entities from various resources. (2) We merge equivalent entities in different languages to facilitate knowledge sharing, and we provide a large-scale entity linking system to establish associations between unstructured text and the structured KG. (3) We design a multi-strategy knowledge completion framework, which leverages pre-trained language models and vast amounts of unstructured text to discover missing and new facts. The resulting KG contains 446 concepts, 2,608 properties, 66 million entities, and more than 2 billion facts. It is available and downloadable online at https://www.xlore.cn/, providing a valuable resource for researchers and practitioners in various fields.
Pages: 47