Temporal knowledge extraction from large-scale text corpus

Cited by: 9
Authors
Liu, Yu [1]
Hua, Wen [1]
Zhou, Xiaofang [1]
Affiliations
[1] University of Queensland, School of Information Technology and Electrical Engineering, Brisbane, Qld, Australia
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2021 / Vol. 24 / Issue 01
Keywords
Temporal knowledge harvesting; Temporal patterns; Temporal facts; Knowledge base; BASE;
DOI
10.1007/s11280-020-00836-5
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Knowledge, in practice, is time-variant, and many relations are valid only for a certain period of time. This phenomenon highlights the importance of harvesting temporal-aware knowledge, i.e., relational facts coupled with their valid temporal intervals. Inspired by pattern-based information extraction systems, we resort to temporal patterns to extract time-aware knowledge from free text. However, pattern design is extremely laborious and time-consuming even for a single relation, and free text is usually ambiguous, which makes temporal instance extraction extremely difficult. Therefore, in this work, we study the problem of temporal knowledge extraction in two steps: (1) temporal pattern extraction by automatically analysing a large-scale text corpus with a small number of seed temporal facts, and (2) temporal instance extraction by applying the identified temporal patterns. For pattern extraction, we introduce various techniques, including corpus annotation, pattern generation, scoring and clustering, to improve both the accuracy and the coverage of the extracted patterns. For instance extraction, we propose a double-check strategy to improve accuracy and a set of node-extension rules to improve coverage. We conduct extensive experiments on real-world datasets and compare our approach with state-of-the-art systems. Experimental results verify the effectiveness of our proposed methods for temporal knowledge harvesting.
Pages: 135-156
Page count: 22