A Bayesian mixture model for chromatin interaction data

Cited by: 4
Authors
Niu, Liang [2]
Lin, Shili [1]
Affiliations
[1] Ohio State Univ, Dept Stat, Columbus, OH 43210 USA
[2] Univ Cincinnati, Sch Med, Dept Environm Hlth, Cincinnati, OH 45267 USA
Funding
National Science Foundation (USA)
Keywords
Bayesian mixture model; ChIA-PET; R package; differential expression analysis; androgen receptor; gene expression; RNA-seq; reveals; regions; sites
DOI
10.1515/sagmb-2014-0029
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Chromatin interactions mediated by a particular protein are of interest for studying gene regulation, especially the regulation of genes that are associated with, or known to be causative of, a disease. A recent molecular technique, chromatin interaction analysis by paired-end tag sequencing (ChIA-PET), which uses chromatin immunoprecipitation (ChIP) and high-throughput paired-end sequencing, is able to detect such chromatin interactions genome-wide. However, ChIA-PET may generate noise (i.e., pairings of DNA fragments by random chance) in addition to true signal (i.e., pairings of DNA fragments by interactions). In this paper, we propose MC_DIST, based on a mixture modeling framework, to identify true chromatin interactions from ChIA-PET count data (counts of DNA fragment pairs). The model is cast into a Bayesian framework to take into account the dependency among the data and the available information on protein binding sites and gene promoters, in order to reduce false positives. A simulation study showed that MC_DIST outperforms the previously proposed hypergeometric model in terms of both power and type I error rate. A real data study showed that MC_DIST may identify potential chromatin interactions between protein binding sites and gene promoters that may be missed by the hypergeometric model. An R package implementing the MC_DIST model is available at http://www.stat.osu.edu/~statgen/SOFTWARE/MDM.
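To illustrate the general idea of separating noise from signal in count data with a mixture model, the following is a minimal sketch only. It is not the MC_DIST model: it fits a plain two-component Poisson mixture by EM, without the Bayesian treatment, the dependency structure, or the binding-site and promoter information described in the abstract. The function names and the toy counts are hypothetical and chosen for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of count k at rate lam, computed via logs for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def fit_two_poisson_mixture(counts, n_iter=200):
    """EM for a two-component Poisson mixture (illustrative, not MC_DIST).

    Returns (noise rate, signal rate, signal proportion, per-count
    posterior probability of belonging to the signal component).
    """
    n = len(counts)
    mean = sum(counts) / n
    # Crude starting values: a low-rate "noise" and a high-rate "signal" component.
    lam_noise = max(0.5 * mean, 1e-3)
    lam_signal = 2.0 * mean + 1.0
    pi_signal = 0.5
    resp = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability that each count comes from the signal component.
        resp = []
        for k in counts:
            a = pi_signal * poisson_pmf(k, lam_signal)
            b = (1.0 - pi_signal) * poisson_pmf(k, lam_noise)
            resp.append(a / (a + b) if a + b > 0 else 0.5)
        # M-step: update the mixing proportion and both rates.
        total = min(max(sum(resp), 1e-9), n - 1e-9)
        pi_signal = total / n
        lam_signal = max(sum(r * k for r, k in zip(resp, counts)) / total, 1e-6)
        lam_noise = max(
            sum((1.0 - r) * k for r, k in zip(resp, counts)) / (n - total), 1e-6
        )
    return lam_noise, lam_signal, pi_signal, resp


# Toy counts of paired fragments (made up): mostly small, a few large.
toy_counts = [1, 0, 2, 1, 1, 0, 2, 1, 0, 1, 15, 18, 20, 17]
lam_noise, lam_signal, pi_signal, resp = fit_two_poisson_mixture(toy_counts)
for k, r in zip(toy_counts, resp):
    print(k, round(r, 3))
```

In a sketch like this, pairs with a high posterior probability for the high-rate component would be the candidates for true interactions; the abstract indicates that MC_DIST goes further by modeling dependency among the data and using prior biological information.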
Pages: 53-64
Page count: 12
Related papers
50 in total
  • [1] A Bayesian mixture model for clustering circular data
    Rodriguez, Carlos E.
    Nunez-Antonio, Gabriel
    Escarela, Gabriel
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 143
  • [2] A Bayesian mixture model for partitioning gene expression data
    Zhou, Chuan
    Wakefield, Jon
    BIOMETRICS, 2006, 62 (02) : 515 - 525
  • [3] Allowing for the effect of data binning in a Bayesian Normal mixture model
    Alston, C. L.
    Mengersen, K. L.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2010, 54 (04) : 916 - 923
  • [4] Bayesian mixture model based clustering of replicated microarray data
    Medvedovic, M
    Yeung, KY
    Bumgarner, RE
    BIOINFORMATICS, 2004, 20 (08) : 1222 - 1232
  • [5] Clustering sparse binary data with hierarchical Bayesian Bernoulli mixture model
    Ye, Mao
    Zhang, Peng
    Nie, Lizhen
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2018, 123 : 32 - 49
  • [6] On the Bayesian Mixture Model and Identifiability
    Mena, Ramses H.
    Walker, Stephen G.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2015, 24 (04) : 1155 - 1169
  • [7] A Bayesian mixture model for missing data in marine mammal growth analysis
    Shotwell, Mary E.
    McFee, Wayne E.
    Slate, Elizabeth H.
    ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2016, 23 (04) : 585 - 603
  • [8] Bayesian Pattern Mixture Model for Longitudinal Binary Data with Nonignorable Missingness
    Kyoung, Yujung
    Lee, Keunbaik
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2015, 22 (06) : 589 - 598
  • [9] A Bayesian Mixture Model for Comparative Spectral Count Data in Shotgun Proteomics
    Booth, James G.
    Eilertson, Kirsten E.
    Olinares, Paul Dominic B.
    Yu, Haiyuan
    MOLECULAR & CELLULAR PROTEOMICS, 2011, 10 (08)