Discovering relations between indirectly connected biomedical concepts

Cited by: 15
Authors
Weissenborn, Dirk [1, 2]
Schroeder, Michael [2 ]
Tsatsaronis, George [2 ]
Affiliations
[1] DFKI Projektburo Berlin, D-10559 Berlin, Germany
[2] Tech Univ Dresden, Biotechnol Ctr, D-01307 Dresden, Germany
Keywords
Relation discovery; Biomedical concepts; Text mining
DOI
10.1186/s13326-015-0021-5
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: The complexity and scale of knowledge in the biomedical domain have motivated research on mining heterogeneous data from both structured and unstructured knowledge bases. To this end, facts must be combined in order to formulate hypotheses or draw conclusions about the domain concepts. This work addresses the problem by using indirect knowledge that connects two concepts in a knowledge graph to discover hidden relations between them. The graph represents concepts as vertices and relations as edges, derived from structured (ontologies) and unstructured (textual) data. In this graph, path patterns, i.e. sequences of relations, that potentially characterize a biomedical relation are mined using distant supervision.
Results: Characteristic path patterns of biomedical relations can be identified from this representation using machine learning. For the experimental evaluation, two frequent biomedical relations, "has target" and "may treat", were chosen. The results suggest that relation discovery using indirect knowledge is possible, with an AUC of up to 0.8. This is a substantial improvement over random classification and shows that good predictions can be prioritized by following the suggested approach.
Conclusions: Analysis of the results indicates that the models can successfully learn expressive path patterns for the examined relations. Furthermore, this work demonstrates that the constructed graph allows for easy integration of heterogeneous information and for the discovery of indirect connections between biomedical concepts.
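The notion of a path pattern in the abstract (a sequence of relations along a path between two concepts) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `path_patterns`, the parameters `min_len` and `max_len`, and the toy graph with its concept and relation names are all hypothetical, chosen only to show how relation sequences over indirect connections might be enumerated.

```python
from collections import defaultdict


def path_patterns(edges, source, target, min_len=2, max_len=3):
    """Collect relation sequences along simple paths from source to target.

    edges: iterable of (head, relation, tail) triples (directed).
    min_len=2 keeps only indirect connections (an assumption of this sketch).
    """
    adjacency = defaultdict(list)
    for head, relation, tail in edges:
        adjacency[head].append((relation, tail))

    patterns = set()

    def walk(node, visited, relations):
        # Reaching the target ends the path; record it if it is long enough.
        if node == target and relations:
            if len(relations) >= min_len:
                patterns.add(tuple(relations))
            return
        # Stop extending paths that already use the maximum number of edges.
        if len(relations) == max_len:
            return
        for relation, neighbor in adjacency[node]:
            if neighbor not in visited:
                walk(neighbor, visited | {neighbor}, relations + [relation])

    walk(source, {source}, [])
    return patterns


# Hypothetical toy graph; concept and relation names are illustrative only.
toy_edges = [
    ("drug_A", "binds", "protein_X"),
    ("protein_X", "associated_with", "disease_D"),
    ("drug_A", "similar_to", "drug_B"),
    ("drug_B", "binds", "protein_X"),
    ("drug_A", "co_occurs_with", "disease_D"),  # direct edge, skipped by min_len=2
]
patterns = path_patterns(toy_edges, "drug_A", "disease_D")
```

In a distant-supervision setting such as the one the abstract describes, patterns like these would presumably serve as features for concept pairs already known to stand in a relation such as "may treat"; how the paper actually derives and weights them is not stated in this record.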
Pages: 19