Statistical analysis of genomic protein family and domain controlled annotations for functional investigation of classified gene lists

Cited by: 3
Authors
Masseroli, Marco
Bellistri, Elisa
Franceschini, Andrea
Pinciroli, Francesco
Affiliations
[1] Politecn Milan, Dipartimento Elettron & Informaz, I-20133 Milan, Italy
[2] Politecn Milan, Dipartimento Bioingn, BioMed Informat Lab, I-20133 Milan, Italy
DOI
10.1186/1471-2105-8-S1-S14
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The growing body of protein family and domain annotations is important for understanding protein functions and for gaining insight into relations among the genes that encode them. To enable analysis of gene proteomic annotations, we implemented novel modules within GFINDer, a Web system we previously developed that dynamically aggregates functional and phenotypic annotations of user-uploaded gene lists and supports their statistical analysis and mining.
Results: Exploiting protein information in the Pfam and InterPro databanks, we developed and added to GFINDer original modules specifically devoted to exploring and analyzing the functional signatures of gene protein products. They allow annotating numerous user-classified nucleotide sequence identifiers with controlled information on related protein families, domains and functional sites, classifying them according to these protein annotation categories, and statistically analyzing the resulting classifications. In particular, when the uploaded nucleotide sequence identifiers are subdivided into classes, the Statistics Protein Families&Domains module estimates the relevance of Pfam or InterPro controlled annotations for the uploaded genes by highlighting protein signatures that are significantly over-represented within user-defined classes of genes. In addition, the Logistic Regression module identifies the protein functional signatures that best explain the considered gene classification.
Conclusion: The novel GFINDer modules provide genomic protein family and domain analyses that support better functional interpretation of gene classes, for instance those defined through statistical and clustering analyses of gene expression results from microarray experiments. They can hence help in understanding fundamental biological processes and complex cellular mechanisms influenced by protein domain composition, and contribute to unveiling new biomedical knowledge about the encoding genes.
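The abstract does not name the statistical test behind "significantly more represented" signatures, so the following is only a minimal sketch of one common way to compute such an over-representation score. It assumes a one-sided hypergeometric test per protein signature followed by Benjamini-Hochberg adjustment. The function names, the toy gene identifiers and the class labels are hypothetical and are not taken from GFINDer; the two Pfam accessions appear only as example labels.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k): n genes in the class, K annotated genes, N genes in total."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_signatures(gene_class, gene_domains, target_class):
    """Map each signature to (raw p-value, adjusted p-value) for one gene class."""
    genes = list(gene_class)
    N = len(genes)
    in_class = {g for g in genes if gene_class[g] == target_class}
    signatures = sorted({d for g in genes for d in gene_domains.get(g, ())})
    pvals = []
    for sig in signatures:
        annotated = {g for g in genes if sig in gene_domains.get(g, ())}
        k = len(annotated & in_class)
        pvals.append(hypergeom_enrichment_p(k, len(in_class), len(annotated), N))
    adjusted = benjamini_hochberg(pvals)
    return {sig: (p, q) for sig, p, q in zip(signatures, pvals, adjusted)}


# Hypothetical toy input: eight classified genes with one signature each.
toy_class = {"g1": "up", "g2": "up", "g3": "up", "g4": "down",
             "g5": "down", "g6": "down", "g7": "down", "g8": "down"}
toy_domains = {g: {"PF00069"} if c == "up" else {"PF00001"}
               for g, c in toy_class.items()}

result = enriched_signatures(toy_class, toy_domains, "up")
for signature, (p, q) in result.items():
    print(signature, round(p, 4), round(q, 4))
```

In this toy input the signature carried by every "up" gene receives the smallest p-value for the "up" class. Real use would need the actual annotations and a considered choice of background gene set, neither of which is specified in this record.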
Pages: 10