An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data

Cited by: 15
Authors
Jenkinson, Garrett [1, 2]
Abante, Jordi [1]
Feinberg, Andrew P. [2, 3, 4]
Goutsias, John [1]
Affiliations
[1] Johns Hopkins Univ, Whitaker Biomed Engn Inst, Baltimore, MD 21218 USA
[2] Johns Hopkins Sch Med, Ctr Epigenet, Baltimore, MD USA
[3] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD USA
[4] Johns Hopkins Sch Med, Dept Med, Baltimore, MD USA
Source
BMC BIOINFORMATICS | 2018 / Vol. 19
Keywords
DNA methylation; Genome analysis; Information theory; Ising model; Methylation analysis; WGBS data modeling and analysis; DIFFERENTIALLY METHYLATED REGIONS; FALSE DISCOVERY RATE; DNA METHYLATION; CPG ISLANDS; OPTIMIZATION; POWERFUL; GENES;
DOI
10.1186/s12859-018-2086-5
CLC classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis are employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical [...]
Results: We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied to single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrates a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data.
Conclusions: This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.
Pages: 23
Related papers
50 items in total
  • [31] A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data
    Wang, Ting
    Guan, Weihua
    Lin, Jerome
    Boutaoui, Nadia
    Canino, Glorisa
    Luo, Jianhua
    Celedon, Juan Carlos
    Chen, Wei
    [J]. EPIGENETICS, 2015, 10 (07) : 662 - 669
  • [32] An Information-theoretic approach for computational material modeling
    Furukawa, Tomonari
    Michopoulos, John G.
    [J]. ADVANCES IN FRACTURE AND MATERIALS BEHAVIOR, PTS 1 AND 2, 2008, 33-37 : 857 - +
  • [33] Genome-wide analysis of DNA Methylation profiles on sheep ovaries associated with prolificacy using whole-genome Bisulfite sequencing
    Yanli Zhang
    Fengzhe Li
    Xu Feng
    Hua Yang
    Aoxiang Zhu
    Jing Pang
    Le Han
    Tingting Zhang
    Xiaolei Yao
    Feng Wang
    [J]. BMC Genomics, 18
  • [34] Analysis of DNA methylation profiles during sheep skeletal muscle development using whole-genome bisulfite sequencing
    Yixuan Fan
    Yaxu Liang
    Kaiping Deng
    Zhen Zhang
    Guomin Zhang
    Yanli Zhang
    Feng Wang
    [J]. BMC Genomics, 21
  • [35] Analysis of DNA Methylation Profiles in Mandibular Condyle of Chicks With Crossed Beaks Using Whole-Genome Bisulfite Sequencing
    Shi, Lei
    Bai, Hao
    Li, Yunlei
    Yuan, Jingwei
    Wang, Panlin
    Wang, Yuanmei
    Ni, Aixin
    Jiang, Linlin
    Ge, Pingzhuang
    Bian, Shixiong
    Zong, Yunhe
    Isa, Adamu Mani
    Tesfay, Hailai Hagos
    Yang, Fujian
    Ma, Hui
    Sun, Yanyan
    Chen, Jilan
    [J]. FRONTIERS IN GENETICS, 2021, 12
  • [36] Analysis of DNA methylation profiles during sheep skeletal muscle development using whole-genome bisulfite sequencing
    Fan, Yixuan
    Liang, Yaxu
    Deng, Kaiping
    Zhang, Zhen
    Zhang, Guomin
    Zhang, Yanli
    Wang, Feng
    [J]. BMC GENOMICS, 2020, 21 (01)
  • [37] Whole-Genome Bisulfite Sequencing (WGBS) Analysis of Gossypium hirsutum under High-Temperature Stress Conditions
    Gong, Zhaolong
    Zheng, Juyun
    Yang, Ni
    Li, Xueyuan
    Qian, Shuaishuai
    Sun, Fenglei
    Geng, Shiwei
    Liang, Yajun
    Wang, Junduo
    [J]. GENES, 2024, 15 (10)
  • [38] Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data
    Nelly Olova
    Felix Krueger
    Simon Andrews
    David Oxley
    Rebecca V. Berrens
    Miguel R. Branco
    Wolf Reik
    [J]. Genome Biology, 19
  • [39] Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data
    Olova, Nelly
    Krueger, Felix
    Andrews, Simon
    Oxley, David
    Berrens, Rebecca V.
    Branco, Miguel R.
    Reik, Wolf
    [J]. GENOME BIOLOGY, 2018, 19
  • [40] Methy-Pipe: An Integrated Bioinformatics Pipeline for Whole Genome Bisulfite Sequencing Data Analysis
    Jiang, Peiyong
    Sun, Kun
    Lun, Fiona M. F.
    Guo, Andy M.
    Wang, Huating
    Chan, K. C. Allen
    Chiu, Rossa W. K.
    Lo, Y. M. Dennis
    Sun, Hao
    [J]. PLOS ONE, 2014, 9 (06):