Onto-CC: a web server for identifying Gene Ontology conceptual clusters

Cited by: 8
Authors
Romero-Zaliz, R. [1]
del Val, C. [1]
Cobb, J. P. [2]
Zwir, I. [1, 3]
Affiliations
[1] Escuela Tecn Super Ingn Informat & Telecomun, Dept Ciencias Comp & Inteligencia Artificial, Granada 18071, Spain
[2] Washington Univ, Sch Med, Cellular Injury & Adaptat Lab, St Louis, MO USA
[3] Washington Univ, Sch Med, Howard Hughes Med Inst, Dept Mol Microbiol, St Louis, MO 63110 USA
DOI
10.1093/nar/gkn323
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The Gene Ontology (GO) vocabulary has been extensively explored to analyze the functions of coexpressed genes. However, despite its extensive use in biology and the medical sciences, there is still considerable uncertainty about which ontology (i.e. Biological Process, Cellular Component or Molecular Function) should be used, and at which level of specificity. Moreover, the GO database can contain incomplete information resulting from human annotation, or information that is heavily influenced by the knowledge available about a specific branch of an ontology. Despite these drawbacks, there is a trend to ignore these problems and even to use GO terms to conduct searches of gene expression profiles (i.e. expression GO) instead of more cautious approaches that consider them only as an independent source of validation (i.e. expression versus GO). As a consequence, the uncertainty is propagated and the analysis of the required gene grouping hypotheses is biased. We propose a web tool, Onto-CC, as an automatic method specially suited for independent explanation/validation of gene grouping hypotheses (e.g. coexpressed genes) based on GO clusters (i.e. expression versus GO). The Onto-CC approach reduces the uncertainty of the queries by identifying optimal conceptual clusters that combine terms from different ontologies simultaneously, as well as terms defined at different levels of specificity in the GO hierarchy. To do so, we implemented the EMO-CC methodology, inspired by conceptual clustering algorithms, to find clusters in structural databases (here, the GO directed acyclic graph, DAG). This approach allows optimal cluster sets to be managed as potential parallel hypotheses, guided by multiobjective/multimodal optimization techniques. We can therefore generate alternative, yet still optimal, explanations of queries that can provide new insights into a given problem.
Onto-CC has been successfully used to test different medical and biological hypotheses, including the explanation and prediction of gene expression profiles resulting from the host response to injuries in the inflammatory problem. Onto-CC is provided in two versions: Ready2GO, a precalculated EMO-CC for several genomes, and Advanced Onto-CC, for custom annotation files (http://gps-tools2.wustl.edu/ontocc/index.html).
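As a rough illustration of the multiobjective idea described in the abstract, the Python sketch below keeps the non-dominated (Pareto-optimal) candidates from a small set of hypothetical GO-term clusters. The two objectives (query coverage and term specificity), the `Cluster` type, the term labels and the sample scores are all assumptions made for illustration; none of them is taken from the EMO-CC implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical candidate cluster: a set of GO-term labels plus two scores."""
    terms: frozenset
    coverage: float     # assumed objective: fraction of query genes explained
    specificity: float  # assumed objective: how specific the terms are in the GO DAG


def dominates(a: Cluster, b: Cluster) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.coverage >= b.coverage and a.specificity >= b.specificity
    better = a.coverage > b.coverage or a.specificity > b.specificity
    return no_worse and better


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates, in input order."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Placeholder labels; the prefixes only hint at the ontology a term comes from.
candidates = [
    Cluster(frozenset({"BP:term_a"}), coverage=0.80, specificity=0.30),
    Cluster(frozenset({"BP:term_a", "CC:term_b"}), coverage=0.55, specificity=0.70),
    Cluster(frozenset({"CC:term_b"}), coverage=0.50, specificity=0.60),
    Cluster(frozenset({"MF:term_c"}), coverage=0.20, specificity=0.90),
]

front = pareto_front(candidates)
for c in front:
    print(sorted(c.terms), c.coverage, c.specificity)
```

In this toy data the third candidate is discarded because the second scores higher on both objectives. The remaining three are all kept, since none is better than another on both scores at once, which loosely mirrors the notion of alternative explanations held as parallel hypotheses.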
Pages: W352-W357
Number of pages: 6