Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction

Cited by: 18
Authors
Choudhury, Nazim [1]
Faisal, Fahim [2]
Khushi, Matloob [3]
Affiliations
[1] Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620 USA
[2] George Mason Univ, Dept Comp Sci, Fairfax, VA 22030 USA
[3] Univ Sydney, Sch Comp Sci, Sydney, NSW 2006, Australia
Keywords
Literature-based Knowledge Discovery; Dynamic Supervised Link Prediction; Keyword Co-occurrence Network (KCN); Genealogical Community; Weighted Temporal Citation; LINK PREDICTION; NETWORK; RANKING; SYSTEM;
DOI
10.1016/j.joi.2020.101057
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The literature-based discovery process identifies important but implicit relations among information embedded in published literature. Existing techniques from Information Retrieval (IR) and Natural Language Processing (NLP) attempt to identify hidden or unpublished connections between information concepts within published literature; however, these techniques overlook the prediction of future and emerging relations among scientific knowledge components, such as author-selected keywords, encapsulated within the literature. A Keyword Co-occurrence Network (KCN), built upon author-selected keywords, is considered a knowledge graph that focuses on both these knowledge components and the knowledge structure of a scientific domain by examining the relationships between knowledge entities. Using data from two multidisciplinary research domains other than the biomedical domain, and capitalizing on bibliometrics, the dynamicity of temporal KCNs, and a recurrent neural network, this study develops novel features that support the prediction of future literature-based discoveries: the emerging connections (co-appearances in the same article) among keywords. Temporal importance extracted from both bipartite and unipartite networks, communities defined by genealogical relations, and the relative importance of temporal citation counts were used in the feature construction process. Both node- and edge-level features were input into a recurrent neural network to forecast the feature values and predict future relations between different scientific concepts/topics represented by the author-selected keywords. High performance rates, compared against both a contemporary heterogeneous network-based method and the preferential attachment process, suggest that these features complement both the prediction of future literature-based discoveries and emerging trend analysis. © 2020 Elsevier Ltd. All rights reserved.
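To make the abstract's terminology concrete, the following is a minimal sketch, not the authors' implementation, of how yearly keyword co-occurrence snapshots might be built and how the preferential attachment baseline could score keyword pairs that are not yet linked. The toy records in `ARTICLES` and the function names are hypothetical, and none of the study's bibliometric, genealogical, or citation-based features or its recurrent neural network are reproduced here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (publication year, author-selected keywords).
ARTICLES = [
    (2018, ["link prediction", "knowledge graph"]),
    (2018, ["knowledge graph", "bibliometrics"]),
    (2019, ["link prediction", "bibliometrics", "neural network"]),
]


def build_temporal_kcn(articles):
    """Return {year: {(kw_a, kw_b): count}}, one co-occurrence snapshot per year.

    Two keywords are linked when they appear in the same article; each pair
    is stored in alphabetical order so (a, b) and (b, a) are the same edge.
    """
    snapshots = defaultdict(lambda: defaultdict(int))
    for year, keywords in articles:
        for a, b in combinations(sorted(set(keywords)), 2):
            snapshots[year][(a, b)] += 1
    return {year: dict(edges) for year, edges in snapshots.items()}


def degrees(edges):
    """Count the number of distinct neighbours of each keyword."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def preferential_attachment_scores(edges):
    """Score every unlinked keyword pair by the product of its node degrees."""
    deg = degrees(edges)
    return {
        (a, b): deg[a] * deg[b]
        for a, b in combinations(sorted(deg), 2)
        if (a, b) not in edges
    }


kcn = build_temporal_kcn(ARTICLES)
scores_2018 = preferential_attachment_scores(kcn[2018])
```

In this toy data, the only pair left unlinked in the 2018 snapshot is the one that co-occurs in the 2019 article, which is the kind of emerging connection the study aims to predict.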
Pages: 23