Bayesian model accounting for within-class biological variability in Serial Analysis of Gene Expression (SAGE), art. no. 119

Cited by: 50
Authors
Vêncio, RZN
Brentani, H
Patrao, DFC
Pereira, CAB
Affiliations
[1] Univ Sao Paulo, Dept Stat, Inst Matemat & Estat, BR-05508090 Sao Paulo, Brazil
[2] Univ Sao Paulo, BIOINFO USP, Nucleo Pesquisas & Bioinformat, BR-05508090 Sao Paulo, Brazil
[3] Ludwig Inst Canc Res, Sao Paulo Branch, BR-01519010 Sao Paulo, Brazil
[4] Hosp Canc AC Camargo, BR-01519010 Sao Paulo, Brazil
Keywords
DOI
10.1186/1471-2105-5-119
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results: We introduce a Bayesian model that accounts for within-class variability by means of a mixture distribution. We show that the previously available approaches, aggregation in pools ("pseudo-libraries") and the Beta-Binomial model, are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags that are regarded as differentially expressed with high significance if the within-class variability is ignored, but that are clearly less significant once it is accounted for. Conclusion: Using available information about biological replicates, one can transform a list of candidate differentially expressed transcripts into a more reliable one. Our method is freely available, under the GNU GPL (copyleft), through a user-friendly web-based online tool or as R language scripts at the supplemental web site.
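The abstract's statement that pooling and the Beta-Binomial model are particular cases of a mixture model can be illustrated with a minimal sketch. It assumes a Binomial sampling model for a tag count x in a library of n tags and a Beta(a, b) mixing distribution for the tag's true abundance. The function names and the parameters a and b are illustrative assumptions and are not taken from the authors' R scripts, whose exact model may differ.

```python
import math

def log_choose(n, k):
    # log of the binomial coefficient C(n, k), via log-gamma
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def binomial_logpmf(x, n, p):
    # Technical sampling error only: every library shares the same abundance p
    # (hypothetical stand-in for the pooled "pseudo-library" view).
    return log_choose(n, x) + x * math.log(p) + (n - x) * math.log(1.0 - p)

def beta_binomial_logpmf(x, n, a, b):
    # Binomial sampling mixed over abundance ~ Beta(a, b): within-class
    # variability enters through the spread of the Beta distribution.
    return log_choose(n, x) + log_beta(x + a, n - x + b) - log_beta(a, b)

def beta_params(mean, concentration):
    # Illustrative reparametrization: a + b = concentration, a / (a + b) = mean.
    return mean * concentration, (1.0 - mean) * concentration
```

Under these assumptions, letting the concentration a + b grow large shrinks the Beta distribution toward a single abundance value, and the Beta-Binomial probability approaches the plain Binomial one, which corresponds to ignoring within-class variability. With a small concentration, extreme counts become far more probable, so an observed difference between classes is weaker evidence of differential expression.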
Pages: 13
Related papers
24 items in total
[1]  
AITCHISON J, 1985, J ROY STAT SOC B MET, V47, P136
[2]  
Aitchison J., 1975, Statistical Prediction Analysis
[3]   Correction of sequence-based artifacts in serial analysis of gene expression [J].
Akmaev, VR ;
Wang, CJ .
BIOINFORMATICS, 2004, 20 (08) :1254-1263
[4]  
[Anonymous], 2003, GENOME BIOL
[5]   The significance of digital gene expression profiles [J].
Audic, S ;
Claverie, JM .
GENOME RESEARCH, 1997, 7 (10) :986-995
[6]   Differential expression in SAGE: accounting for normal between-library variation [J].
Baggerly, KA ;
Deng, L ;
Morris, JS ;
Aldaz, CM .
BIOINFORMATICS, 2003, 19 (12) :1477-1483
[7]  
BLADES NJ, IN PRESS BIOINFORMAT
[8]   An anatomy of normal and malignant gene expression [J].
Boon, K ;
Osório, EC ;
Greenhut, SF ;
Schaefer, CF ;
Shoemaker, J ;
Polyak, K ;
Morin, PJ ;
Buetow, KH ;
Strausberg, RL ;
de Souza, SJ ;
Riggins, GJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (17) :11287-11292
[9]   Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays [J].
Brenner, S ;
Johnson, M ;
Bridgham, J ;
Golda, G ;
Lloyd, DH ;
Johnson, D ;
Luo, SJ ;
McCurdy, S ;
Foy, M ;
Ewan, M ;
Roth, R ;
George, D ;
Eletr, S ;
Albrecht, G ;
Vermaas, E ;
Williams, SR ;
Moon, K ;
Burcham, T ;
Pallas, M ;
DuBridge, RB ;
Kirchner, J ;
Fearon, K ;
Mao, J ;
Corcoran, K .
NATURE BIOTECHNOLOGY, 2000, 18 (06) :630-634
[10]   Environmental genotoxicity evaluation: Bayesian approach for a mixture statistical model [J].
Bueno, AMD ;
Pereira, CAD ;
Rabello-Gay, MN ;
Stern, JM .
STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2002, 16 (04) :267-278