XLORE 3: A Large-Scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge Resources

Times cited: 0
Authors
Zeng, Kaisheng [1 ]
Jin, Hailong [1 ]
Lv, Xin [1 ]
Zhu, Fangwei [2 ]
Hou, Lei [1 ]
Zhang, Yi [1 ]
Pang, Fan [1 ]
Qi, Yu [1 ]
Liu, Dingxiao [1 ]
Li, Juanzi [1 ]
Feng, Ling [1 ]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Peking Univ, Beijing, Peoples R China
Keywords
Knowledge graph; knowledge management; knowledge fusion; knowledge completion; schema construction; entity typing; entity alignment; entity linking; ENTITY; CONSTRUCTION
DOI
10.1145/3660521
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, knowledge graphs (KGs) have attracted significant attention from academia and industry, resulting in the development of numerous technologies for KG construction, completion, and application. XLORE is one of the largest multilingual KGs, built from Baidu Baike and Wikipedia via a series of knowledge modeling and acquisition methods. In this article, we apply systematic methods to improve XLORE's data quality and present its latest version, XLORE 3, which enables the effective integration and management of heterogeneous knowledge from diverse resources. Compared with previous versions, XLORE 3 has three major advantages: (1) We design a comprehensive and reasonable schema, the XLORE ontology, which can effectively organize and manage entities from various resources. (2) We merge equivalent entities in different languages to facilitate knowledge sharing, and we provide a large-scale entity linking system to establish associations between unstructured text and the structured KG. (3) We design a multi-strategy knowledge completion framework, which leverages pre-trained language models and vast amounts of unstructured text to discover missing and new facts. The resulting KG contains 446 concepts, 2,608 properties, 66 million entities, and more than 2 billion facts. It is available for download at https://www.xlore.cn/, providing a valuable resource for researchers and practitioners in various fields.
Pages: 47