EXPLORING A SUBGRAPH MATCHING APPROACH FOR EXTRACTING BIOLOGICAL EVENTS FROM LITERATURE

Cited by: 3
Authors
Liu, Haibin [1]
Keselj, Vlado [1]
Blouin, Christian [1]
Affiliations
[1] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 1W5, Canada
Keywords
biological event extraction; biological information extraction; subgraph matching; subgraph isomorphism; protein-protein interactions; natural-language parsers; corpus; algorithm
DOI
10.1111/coin.12009
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An important task in biological information extraction is to identify descriptions of biological relations and events involving genes or proteins. In this work, we propose a graph-based approach to automatically learn rules for detecting biological events in the life science literature. The event rules are learned by identifying the key contextual dependencies from full parsing of annotated text. Detection is performed by searching for isomorphism between event rules and the dependency graphs of complete sentences. When applying our approach to the data sets of Task 1 of the BioNLP-ST 2009, we achieved a 40.71% F-score in detecting biological events across nine event types. Our 56.32% precision is comparable with that of state-of-the-art systems. Because the approach requires neither manual intervention nor external domain-specific resources, it may also be generalized to extract events from other domains where training data are available. The subgraph matching algorithm we developed is released under the new BSD license and can be downloaded from http://esmalgorithm.sourceforge.net.
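The detection step described in the abstract (matching an event rule against a sentence's dependency graph) can be illustrated with a small backtracking search. The sketch below is not the authors' released algorithm. It is a minimal, assumed formulation in which a rule and a sentence are both directed graphs with labeled nodes and labeled edges, and a match is an injective node mapping that preserves node labels and every rule edge. The function name `find_matches`, the node identifiers, the label scheme, and the example sentence are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# An edge is (head node, dependency label, dependent node).
Edge = Tuple[str, str, str]


def find_matches(
    rule_edges: List[Edge],
    rule_labels: Dict[str, str],
    sent_edges: List[Edge],
    sent_labels: Dict[str, str],
) -> List[Dict[str, str]]:
    """Return every injective mapping from rule nodes to sentence nodes
    that preserves node labels and all labeled, directed rule edges."""
    rule_nodes = sorted(rule_labels)
    sent_edge_set: Set[Edge] = set(sent_edges)
    results: List[Dict[str, str]] = []

    def consistent(mapping: Dict[str, str]) -> bool:
        # Check only rule edges whose two endpoints are already mapped.
        for head, label, dep in rule_edges:
            if head in mapping and dep in mapping:
                if (mapping[head], label, mapping[dep]) not in sent_edge_set:
                    return False
        return True

    def extend(i: int, mapping: Dict[str, str], used: Set[str]) -> None:
        if i == len(rule_nodes):
            results.append(dict(mapping))
            return
        r = rule_nodes[i]
        for s, s_label in sent_labels.items():
            if s in used or s_label != rule_labels[r]:
                continue
            mapping[r] = s
            used.add(s)
            if consistent(mapping):
                extend(i + 1, mapping, used)
            del mapping[r]
            used.discard(s)

    extend(0, {}, set())
    return results


# Hypothetical rule: a "phosphorylation" trigger governing a protein via prep_of.
rule_labels = {"trigger": "phosphorylation", "theme": "PROTEIN"}
rule_edges = [("trigger", "prep_of", "theme")]

# Hypothetical dependency graph for "Phosphorylation of TRAF2 inhibits binding".
sent_labels = {
    "phosphorylation-1": "phosphorylation",
    "TRAF2-3": "PROTEIN",
    "inhibits-4": "inhibit",
    "binding-5": "binding",
}
sent_edges = [
    ("inhibits-4", "nsubj", "phosphorylation-1"),
    ("phosphorylation-1", "prep_of", "TRAF2-3"),
    ("inhibits-4", "dobj", "binding-5"),
]

matches = find_matches(rule_edges, rule_labels, sent_edges, sent_labels)
print(matches)
```

This sketch requires exact label equality and does not forbid extra sentence edges among the matched nodes. A practical system would likely relax or tighten both choices, and the exhaustive search is exponential in the worst case, which is tolerable only because rule graphs and sentence graphs are small.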
Pages: 600-635
Page count: 36