A multi-ontology approach to annotate scientific documents based on a modularization technique

Cited by: 1
Authors
Correa e Castro Gomes, Priscilla
de Carvalho Moura, Ana Maria
Claudia Cavalcanti, Maria
Affiliations
[1] Instituto Militar de Engenharia, Brazil
[2] Laboratório Nacional de Computação Científica, Brazil
Keywords
Biomedical ontologies; Ontology modularization; Text annotation; Semantic Web; Text
DOI
10.1016/j.jbi.2015.09.022
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Scientific text annotation has become an important task for biomedical scientists. There is an increasing need for intelligent systems that support new scientific findings. Public databases available on the Web provide useful data, but much more useful information is accessible only in scientific texts. Text annotation may help, as it relies on ontologies to keep annotations based on a uniform vocabulary. However, ontologies are difficult to use, especially those that cover a large domain. In addition, since scientific texts explore multiple domains, which are covered by distinct ontologies, the task becomes even more difficult. Moreover, there are dozens of ontologies in the biomedical area, and they usually contain a large number of concepts. It is in this context that ontology modularization can be useful. This work presents an approach to annotate scientific documents using modules of different ontologies, which are built according to a module extraction technique. The main idea is to analyze a set of single-ontology annotations on a text to identify the user's interests. Based on these annotations, a set of modules is extracted from a set of distinct ontologies and made available to the user for complementary annotation. The reduced size and focus of the extracted modules tend to facilitate the annotation task. An experiment was conducted to evaluate this approach, with the participation of a bioinformatics specialist from the Laboratory of Peptides and Proteins of the IOC/Fiocruz, who was interested in discovering new drug targets to combat tropical diseases. © 2015 Elsevier Inc. All rights reserved.
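The flow described in the abstract (existing annotations indicate the user's interests, and small modules are then extracted from several ontologies) can be illustrated with a minimal sketch. This is not the authors' module extraction technique, which the abstract does not specify. It assumes a simple upward-closure heuristic (each matched concept plus all of its ancestors), and the toy ontologies, the concept names, and the functions `extract_module` and `modules_for_annotations` are hypothetical, invented for illustration only.

```python
from collections import deque


def extract_module(parents, seeds):
    """Return the seed concepts plus all of their ancestors.

    `parents` maps each concept to a list of its direct parents.
    This upward closure is an assumed heuristic, not the paper's method.
    """
    module = set()
    queue = deque(s for s in seeds if s in parents)
    while queue:
        concept = queue.popleft()
        if concept in module:
            continue
        module.add(concept)
        queue.extend(parents.get(concept, ()))
    return module


def modules_for_annotations(ontologies, annotated_terms):
    """Build one module per ontology that contains an annotated term.

    Matching is a plain case-insensitive comparison of concept names.
    """
    wanted = {t.lower() for t in annotated_terms}
    result = {}
    for name, parents in ontologies.items():
        seeds = [c for c in parents if c.lower() in wanted]
        if seeds:
            result[name] = extract_module(parents, seeds)
    return result


# Toy ontologies (hypothetical data, child -> direct parents).
ontologies = {
    "toy_protein": {
        "molecule": [],
        "protein": ["molecule"],
        "enzyme": ["protein"],
        "kinase": ["enzyme"],
        "lipid": ["molecule"],
    },
    "toy_disease": {
        "disease": [],
        "infectious disease": ["disease"],
        "malaria": ["infectious disease"],
        "diabetes": ["disease"],
    },
}

# Terms taken from a prior single-ontology annotation pass (assumed input).
modules = modules_for_annotations(ontologies, ["Kinase", "malaria"])
for name in sorted(modules):
    print(name, sorted(modules[name]))
```

Each resulting module keeps only the branch relevant to the annotated terms, which mirrors the abstract's point that reduced size and focus make complementary annotation easier.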
Pages: 208-219
Number of pages: 12