MEFE: A Multi-fEature Knowledge Fusion and Evaluation Method Based on BERT

Cited by: 1
Authors
Ji, Yimu [1 ,2 ,3 ,4 ,5 ]
Hu, Lin [1 ,3 ]
Liu, Shangdong [1 ,2 ,3 ,4 ,5 ]
Xu, Zhengyang [1 ,3 ]
Liu, Yanlan [1 ,3 ]
Liu, Kaihang [1 ,3 ]
Tang, Shuning [1 ,3 ]
Liu, Qiang [1 ,3 ]
Xiao, Wan [3 ,4 ,6 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp Sci, Nanjing 210023, Peoples R China
[2] Jiangsu High Technol Res Key Lab Wireless Sensor, Nanjing 210003, Jiangsu, Peoples R China
[3] Nanjing Univ Posts & Telecommun, Inst High Performance Comp & Bigdata, Nanjing 210003, Jiangsu, Peoples R China
[4] Nanjing Ctr HPC China, Nanjing 210003, Jiangsu, Peoples R China
[5] Jiangsu HPC & Intelligent Proc Engineer Res Ctr, Nanjing 210003, Jiangsu, Peoples R China
[6] Nanjing Univ Posts & Telecommun, Coll Educ Sci & Technol, Nanjing 210023, Peoples R China
Source
ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT II | 2020 / Vol. 12453
Funding
National Key Research and Development Program of China;
Keywords
Multi-source knowledge base; Knowledge fusion; Vectorization of category labels; BERT; Quality evaluation;
DOI
10.1007/978-3-030-60239-0_30
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology];
Subject classification code
0812 ;
Abstract
Knowledge fusion is an important part of constructing a knowledge graph. In recent years, as major knowledge bases have grown, integrating multi-source knowledge bases has become both the focus and the main difficulty of knowledge fusion. Because knowledge base structures differ widely, the efficiency and accuracy of fusion remain low. To address this problem, this paper proposes MEFE (Multi-fEature Knowledge Fusion and Evaluation Method), which is based on BERT. MEFE jointly considers the attributes, descriptions, and category characteristics of entities to perform knowledge fusion on multi-source knowledge bases. First, MEFE uses entity category tags to build a category dictionary. Then, it vectorizes the category tags based on the dictionary and clusters the entities according to their category tags. Finally, it uses BERT (Bidirectional Encoder Representations from Transformers) to calculate entity similarity for the entity pairs within the same group. We calculate the entity redundancy rate and the information loss rate of the knowledge base from the fusion result in order to evaluate the quality of the knowledge base. Experiments show that MEFE effectively improves the efficiency of knowledge fusion through clustering, and that the use of BERT improves the accuracy of fusion.
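The three steps named in the abstract (category dictionary, category vectorization and grouping, similarity only within a group) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy entities, the function names, the exact-match grouping used in place of clustering, the token-overlap score used in place of BERT similarity, and the 0.6 threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy entities: (name, category tags, description).
ENTITIES = [
    ("Nanjing", {"city", "place"}, "capital city of Jiangsu province in China"),
    ("Nanjing City", {"place", "city"}, "capital city of Jiangsu province"),
    ("BERT", {"model"}, "language representation model based on Transformers"),
    ("Bert model", {"model"}, "Transformers based language representation model"),
    ("Yangtze", {"river", "place"}, "longest river in China"),
]


def build_category_dictionary(entities):
    """Map each distinct category tag to a fixed index."""
    tags = sorted({tag for _, cats, _ in entities for tag in cats})
    return {tag: i for i, tag in enumerate(tags)}


def vectorize(cats, dictionary):
    """Return the multi-hot vector of an entity's category tags."""
    vec = [0] * len(dictionary)
    for tag in cats:
        vec[dictionary[tag]] = 1
    return tuple(vec)


def group_by_category(entities, dictionary):
    """Assumed stand-in for clustering: identical vectors share a group."""
    groups = defaultdict(list)
    for entity in entities:
        groups[vectorize(entity[1], dictionary)].append(entity)
    return list(groups.values())


def similarity(a, b):
    """Assumed placeholder for a BERT-based score: token Jaccard overlap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)


def candidate_matches(entities, threshold=0.6):
    """Compare descriptions only for entity pairs inside the same group."""
    dictionary = build_category_dictionary(entities)
    matches = []
    for group in group_by_category(entities, dictionary):
        for (n1, _, d1), (n2, _, d2) in combinations(group, 2):
            if similarity(d1, d2) >= threshold:
                matches.append((n1, n2))
    return matches


def pairs_compared(entities):
    """Count the within-group pairs that would be scored."""
    dictionary = build_category_dictionary(entities)
    groups = group_by_category(entities, dictionary)
    return sum(len(g) * (len(g) - 1) // 2 for g in groups)
```

On this toy data, grouping reduces the number of scored pairs from 10 (all pairs of 5 entities) to 2, which is the kind of efficiency gain the abstract attributes to clustering. A real system would replace `similarity` with a score derived from BERT and would likely use a proper clustering algorithm rather than exact vector matches.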
Pages: 449-462
Number of pages: 14
Related papers
15 in total
[1]  
Carlson A, 2010, AAAI CONF ARTIF INTE, P1306
[2]  
Cohen W. W., 2002, P 8 ACM SIGKDD INT C, P475
[3]  
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[4]   Unsupervised Entity Alignment Using Attribute Triples and Relation Triples [J].
He, Fuzhen ;
Li, Zhixu ;
Qiang, Yang ;
Liu, An ;
Liu, Guanfeng ;
Zhao, Pengpeng ;
Zhao, Lei ;
Zhang, Min ;
Chen, Zhigang .
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2019), PT I, 2019, 11446 :367-382
[5]   YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia [J].
Hoffart, Johannes ;
Suchanek, Fabian M. ;
Berberich, Klaus ;
Weikum, Gerhard .
ARTIFICIAL INTELLIGENCE, 2013, 194 :28-61
[6]  
Lacoste-Julien S, 2013, 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), P572
[7]   DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia [J].
Lehmann, Jens ;
Isele, Robert ;
Jakob, Max ;
Jentzsch, Anja ;
Kontokostas, Dimitris ;
Mendes, Pablo N. ;
Hellmann, Sebastian ;
Morsey, Mohamed ;
van Kleef, Patrick ;
Auer, Soeren ;
Bizer, Christian .
SEMANTIC WEB, 2015, 6 (02) :167-195
[8]  
McCallum Andrew, 2005, PROC C ADV NEURAL IN, P905
[9]  
Niu X, 2011, LECT NOTES COMPUT SC, V7032, P205, DOI 10.1007/978-3-642-25093-4_14
[10]  
Solemn L.G., 2016, COMPUT RES DEV, V53, P165