Modelling the conditional regulatory activity of methylated and bivalent promoters
Cited by: 6
Authors:
Budden, David M. [1,2]
Hurley, Daniel G. [1]
Crampin, Edmund J. [1,2,3,4,5]
Affiliations:
[1] Univ Melbourne, Melbourne Sch Engn, Syst Biol Lab, Parkville, Vic 3010, Australia
[2] Univ Melbourne, NICTA Victoria Res Lab, Parkville, Vic 3010, Australia
[3] ARC Ctr Excellence Convergent Bionano Sci & Techn, Parkville, Vic 3010, Australia
[4] Univ Melbourne, Dept Math & Stat, Parkville, Vic 3010, Australia
[5] Univ Melbourne, Sch Med, Parkville, Vic 3010, Australia
Funding:
Australian Research Council;
Keywords:
TRANSCRIPTION FACTOR-BINDING;
DNA METHYLATION;
GENOME-WIDE;
GENE-EXPRESSION;
HISTONE MODIFICATIONS;
CHROMATIN;
H2A.Z;
DYNAMICS;
VARIANT;
REPAIR;
DOI:
10.1186/s13072-015-0013-9
Chinese Library Classification (CLC) number:
Q3 [Genetics];
Subject classification codes:
071007 ;
090102 ;
Abstract:
Background: Predictive modelling of gene expression is a powerful framework for the in silico exploration of transcriptional regulatory interactions through the integration of high-throughput -omics data. A major limitation of previous approaches is their inability to handle conditional interactions that emerge when genes are subject to different regulatory mechanisms. Although chromatin immunoprecipitation-based histone modification data are often used as proxies for chromatin accessibility, the association between these variables and expression often depends upon the presence of other epigenetic markers (e.g. DNA methylation or histone variants). These conditional interactions are poorly handled by previous predictive models and reduce the reliability of downstream biological inference.
Results: We have previously demonstrated that integrating both transcription factor and histone modification data within a single predictive model is rendered ineffective by their statistical redundancy. In this study, we evaluate four proposed methods for quantifying gene-level DNA methylation levels and demonstrate that inclusion of these data in predictive modelling frameworks is also subject to this critical limitation in data integration. Based on the hypothesis that statistical redundancy in epigenetic data is caused by conditional regulatory interactions within a dynamic chromatin context, we construct a new gene expression model which is the first to improve prediction accuracy by unsupervised identification of latent regulatory classes. We show that DNA methylation and H2A.Z histone variant data can be interpreted in this way to identify and explore the signatures of silenced and bivalent promoters, substantially improving genome-wide predictions of mRNA transcript abundance and downstream biological inference across multiple cell lines.
Conclusions: Previous models of gene expression have been applied successfully to several important problems in molecular biology, including the discovery of transcription factor roles, identification of regulatory elements responsible for differential expression patterns and comparative analysis of the transcriptome across distant species. Our analysis supports our hypothesis that statistical redundancy in epigenetic data is partially due to conditional relationships between these regulators and gene expression levels. This analysis provides insight into the heterogeneous roles of H3K4me3 and H3K27me3 in the presence of the H2A.Z histone variant (implicated in cancer progression) and how these signatures change during lineage commitment and carcinogenesis.
Pages: 10