CLUSTERING AND BOOTSTRAPPING BASED FRAMEWORK FOR NEWS KNOWLEDGE BASE COMPLETION

Authors
Srinivasa, K. [1]
Thilagam, P. Santhi [1]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Comp Sci & Engn, NH 66, Mangalore 575025, India
Keywords
Knowledge base completion; natural language processing; information extraction; triples; bootstrap; cluster; INFORMATION EXTRACTION; CONSTRUCTION
DOI
10.31577/cai_2021_2_318
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting facts, namely entities and relations, from unstructured sources is an essential step in any knowledge base construction. It is also necessary to keep the knowledge base complete by incrementally extracting new facts from various sources. To date, knowledge base completion has been studied as a problem of knowledge refinement, where missing facts are inferred by reasoning about the information already present in the knowledge base. However, facts missed while extracting information from multilingual sources are ignored. Hence, this work proposes a generic framework for knowledge base completion that enriches a knowledge base of crime-related facts extracted from online English-language news articles with facts extracted from news articles in Hindi, a low-resourced Indian language. Using the framework, information can be extracted from news articles in any low-resourced language without language-specific tools such as POS tags, provided an appropriate machine translation tool is available. To achieve this, a clustering algorithm is proposed that exploits the redundancy among the bilingual collection of news articles by representing the clusters with knowledge base facts, unlike the existing Bag of Words representation. Within each cluster, the facts extracted from English-language articles are bootstrapped to extract facts from comparable Hindi-language articles. Bootstrapping within the cluster in this way helps to identify sentences from a low-resourced language that carry new information related to the facts extracted from a high-resourced language such as English. The empirical results show that the proposed clustering algorithm produced more accurate and high-quality clusters for monolingual and cross-lingual facts, respectively. Experiments also showed that the proposed framework achieves a high recall rate in extracting new facts from Hindi news articles.
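The two ideas in the abstract, clustering articles by their extracted facts rather than by a Bag of Words, and bootstrapping from English facts within a cluster, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the Jaccard similarity, the greedy single-pass assignment, the 0.3 threshold, the substring matching, and the names `jaccard`, `cluster_by_facts`, and `bootstrap_candidates` are all assumptions made for illustration, and the Hindi text is assumed to have been machine-translated to English beforehand.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A fact is a (subject, relation, object) triple; this shape is an assumption.
Triple = Tuple[str, str, str]


def jaccard(a: Set[Triple], b: Set[Triple]) -> float:
    """Jaccard overlap between two fact sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_facts(
    articles: Dict[str, FrozenSet[Triple]], threshold: float = 0.3
) -> List[List[str]]:
    """Greedy single-pass clustering: each article joins the most similar
    existing cluster if the similarity reaches the threshold, otherwise it
    starts a new cluster. Clusters are represented by their pooled facts."""
    clusters: List[dict] = []
    for art_id, facts in articles.items():
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(set(facts), cluster["facts"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"facts": set(facts), "ids": [art_id]})
        else:
            best["facts"] |= facts
            best["ids"].append(art_id)
    return [cluster["ids"] for cluster in clusters]


def bootstrap_candidates(
    seed_facts: FrozenSet[Triple], sentences: List[str]
) -> List[str]:
    """Keep sentences that mention a subject or object of any seed fact
    (case-insensitive substring match, a deliberately crude heuristic)."""
    entities = {e.lower() for s, _, o in seed_facts for e in (s, o)}
    return [t for t in sentences if any(e in t.lower() for e in entities)]


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    articles = {
        "en1": frozenset({("police", "arrested", "Ravi"),
                          ("Ravi", "accused_of", "theft")}),
        "hi1": frozenset({("police", "arrested", "Ravi")}),
        "en2": frozenset({("court", "sentenced", "Anil")}),
    }
    print(cluster_by_facts(articles))
    print(bootstrap_candidates(
        articles["en1"],
        ["Ravi was caught near the railway station on Monday.",
         "The weather was pleasant."],
    ))
```

In this toy run, the translated Hindi article shares a fact with the first English article and so lands in the same cluster; the English facts then act as seeds for picking out Hindi sentences that may carry new information about the same entities.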
Pages: 318-340
Number of pages: 23