A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays

Cited by: 21
Authors
Lee, MLT
Bulyk, ML
Whitmore, GA
Church, GM
Affiliations
[1] Brigham & Womens Hosp, Channing Lab, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Boston, MA 02115 USA
[3] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[4] Harvard Univ, Sch Med, Dept Genet, Boston, MA 02115 USA
[5] McGill Univ, Montreal, PQ H3A 1G5, Canada
Keywords
binding probability; dependence; log-linear model; log-probability model; microarrays; multistage ANOVA; protein; transcription factor
DOI
10.1111/j.0006-341X.2002.00981.x
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences, as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
Pages: 981-988
Number of pages: 8