GO PaD: The gene ontology partition database

Cited by: 43
Authors
Alterovitz, Gil
Xiang, Michael
Mohan, Mamta
Ramoni, Marco F.
Affiliations
[1] Harvard Univ, Sch Med, Div Hlth Sci & Technol, Boston, MA 02115 USA
[2] MIT, Dept Elect Engn & Comp Sci, Boston, MA 02115 USA
[3] MIT, Dept Biol, Boston, MA 02115 USA
[4] Childrens Hosp, Informat Program, Boston, MA 02115 USA
[5] Harvard Univ, Sch Med, Harvard Partners Ctr Genet & Genom, Boston, MA 02115 USA
DOI
10.1093/nar/gkl799
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Gene Ontology (GO) has been widely used to infer functional significance associated with sets of genes in order to automate discoveries within large-scale genetic studies. A level in GO's directed acyclic graph structure is often assumed to be indicative of its terms' specificities, although other work has suggested this assumption does not hold. Unfortunately, quantitative analysis of biological functions based on nodes at the same level (as is common in gene enrichment analysis tools) can lead to incorrect conclusions as well as missed discoveries due to inefficient use of available information. This paper addresses these issues using an information-theoretic approach, encoded in the GO Partition Database, that is guaranteed to maximize information for gene enrichment analysis. The GO Partition Database was designed to feature ontology partitions with GO terms of similar specificity. The GO partitions comprise varying numbers of nodes and present relevant information-theoretic statistics, so researchers can choose to analyze datasets at arbitrary levels of specificity. The GO Partition Database, featuring GO partition sets for functional analysis of genes from human and 10 other commonly studied organisms, is available at bcl.med.harvard.edu/proj/gopart. The site also includes an online tutorial.
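As a rough illustration of the information-theoretic idea in the abstract, the sketch below computes the information content of a single term and the Shannon entropy of a set of terms from gene counts. It is not the GO Partition Database algorithm, which this record does not describe; the function names and the gene counts are hypothetical, and the terms in each set are assumed to be non-overlapping.

```python
import math
from typing import Sequence


def information_content(term_gene_count: int, total_genes: int) -> float:
    """Information (in bits) carried by a term annotating a fraction of genes."""
    return -math.log2(term_gene_count / total_genes)


def partition_entropy(gene_counts: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a set of non-overlapping terms."""
    total = sum(gene_counts)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in gene_counts
        if count > 0  # empty terms contribute nothing to the entropy
    )


# Hypothetical gene counts for two candidate sets of four terms each.
balanced = [25, 25, 25, 25]  # terms of similar specificity
skewed = [85, 5, 5, 5]       # one very general term, three very specific ones

print(partition_entropy(balanced))
print(partition_entropy(skewed))
```

Under these assumptions, the set whose terms are of similar specificity has the higher entropy, which is the intuition behind preferring such partitions over terms that merely share a graph level.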
Pages: D322-D327
Page count: 6