Combining literature text mining with microarray data: advances for system biology modeling

Cited by: 37
Authors
Faro, Alberto [1 ]
Giordano, Daniela [1 ]
Spampinato, Concetto [1 ]
Affiliation
[1] Univ Catania, Fac Engn, I-95124 Catania, Italy
Keywords
literature text mining; microarray data; biological databases; knowledge discovery; GENE-EXPRESSION; REGULATORY NETWORKS; RELATION EXTRACTION; SEARCH ENGINE; PROTEIN; TOOL; NAME; IDENTIFICATION; COOCCURRENCE; INFORMATION;
DOI
10.1093/bib/bbr018
CLC number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
A huge amount of important biomedical information is hidden in the bulk of research articles in biomedical fields. At the same time, the publication of databases of biological information and of experimental datasets generated by high-throughput methods is expanding rapidly, and a wealth of annotated gene databases and of chemical, genomic (including microarray datasets), clinical and other types of data repositories is now available on the Web. Thus, a current challenge of bioinformatics is to develop targeted methods and tools that integrate scientific literature, biological databases and experimental data, both to reduce database curation time and to access evidence, either in the literature or in the datasets, that is useful for the analysis at hand. Under this scenario, this article reviews the knowledge discovery systems that fuse information from the literature, gathered by text mining, with microarray data in order to enrich the lists of down- and upregulated genes with elements for biological understanding and to generate and validate new biological hypotheses. Finally, an easy-to-use and freely accessible tool, GeneWizard, that exploits text mining and microarray data fusion to support researchers in discovering gene-disease relationships is described.
Pages: 61-82
Page count: 22
Related papers
114 items in total
[71]   Inference of combinatorial Boolean rules of synergistic gene sets from cancer microarray datasets [J].
Park, Inho ;
Lee, Kwang H. ;
Lee, Doheon .
BIOINFORMATICS, 2010, 26 (12) :1506-1512
[72]   G2D: a tool for mining genes associated with disease [J].
Perez-Iratxeta, C ;
Wjst, M ;
Bork, P ;
Andrade, MA .
BMC GENETICS, 2005, 6 (1)
[73]   XplorMed: a tool for exploring MEDLINE abstracts [J].
Perez-Iratxeta, C ;
Bork, P ;
Andrade, MA .
TRENDS IN BIOCHEMICAL SCIENCES, 2001, 26 (09) :573-575
[74]   Update of the G2D tool for prioritization of gene candidates to inherited diseases [J].
Perez-Iratxeta, Carolina ;
Bork, Peer ;
Andrade-Navarro, Miguel A. .
NUCLEIC ACIDS RESEARCH, 2007, 35 :W212-W216
[75]   Identifying regulatory networks by combinatorial analysis of promoter elements [J].
Pilpel, Y ;
Sudarsanam, P ;
Church, GM .
NATURE GENETICS, 2001, 29 (02) :153-159
[76]   NCBI Reference Sequence Project: update and current status [J].
Pruitt, KD ;
Tatusova, T ;
Maglott, DR .
NUCLEIC ACIDS RESEARCH, 2003, 31 (01) :34-37
[77]   Ontology-centric integration and navigation of the dengue literature [J].
Rajapakse, Menaka ;
Kanagasabai, Rajaraman ;
Ang, Wee Tiong ;
Veeramani, Anitha ;
Schreiber, Mark J. ;
Baker, Christopher J. O. .
JOURNAL OF BIOMEDICAL INFORMATICS, 2008, 41 (05) :806-815
[78]   Text processing through web services: calling Whatizit [J].
Rebholz-Schuhmann, Dietrich ;
Arregui, Miguel ;
Gaudan, Sylvain ;
Kirsch, Harald ;
Jimeno, Antonio .
BIOINFORMATICS, 2008, 24 (02) :296-298
[79]   EBIMed - text crunching to gather facts for proteins from Medline [J].
Rebholz-Schuhmann, Dietrich ;
Kirsch, Harald ;
Arregui, Miguel ;
Gaudan, Sylvain ;
Riethoven, Mark ;
Stoehr, Peter .
BIOINFORMATICS, 2007, 23 (02) :E237-E244
[80]   Mining of relations between proteins over biomedical scientific literature using a deep-linguistic approach [J].
Rinaldi, Fabio ;
Schneider, Gerold ;
Kaljurand, Kaarel ;
Hess, Michael ;
Andronis, Christos ;
Konstandi, Ourania ;
Persidis, Andreas .
ARTIFICIAL INTELLIGENCE IN MEDICINE, 2007, 39 (02) :127-136