A Generalized Linear Model for Peak Calling in ChIP-Seq Data

Cited by: 4
Authors
Xu, Jialin [1]
Zhang, Yu [1]
Affiliation
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Keywords
generalized linear model; ChIP-Seq; peak calling
DOI
10.1089/cmb.2012.0023
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) has become a routine method for detecting genome-wide protein-DNA interactions. The success of ChIP-Seq data analysis depends heavily on the quality of peak calling, i.e., detecting peaks of tag counts at a genomic location and evaluating whether each peak corresponds to a real protein-DNA interaction event. The challenges in peak calling include (1) how to combine the forward and reverse strand tag data to improve the power of peak calling and (2) how to account for the variation of tag data observed across different genomic locations. We introduce a new peak calling method based on the generalized linear model (GLMNB) that uses the negative binomial distribution to model the tag count data and to account for the variation of background tags, which may randomly bind to the DNA sequence at varying levels due to local genomic structure and sequence content. We allow local shifting of peaks observed on the forward and reverse strands, such that at each potential binding site, a binding profile representing the pattern of a real peak signal is fitted to best explain the observed tag data by maximum likelihood. Our method can also detect multiple peaks within a local region if the region contains multiple binding sites.
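The core idea in the abstract, a negative binomial model with a log link whose mean follows a binding profile that is shifted separately on each strand, can be illustrated with a small sketch. This is not the authors' GLMNB implementation. The triangular profile, the fixed coefficients `b0` and `b1`, the dispersion `r`, and the grid search over shifts are all assumptions chosen for illustration; the actual method fits its parameters by maximum likelihood.

```python
import math

def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu and size r."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))

def triangle_profile(n, center, half_width):
    """Hypothetical binding profile: a triangle of height 1 peaking at `center`."""
    return [max(0.0, 1.0 - abs(i - center) / half_width) for i in range(n)]

def loglik(counts, profile, b0, b1, r):
    """Log-likelihood under the log-link model log(mu_i) = b0 + b1 * profile_i."""
    return sum(nb_logpmf(k, math.exp(b0 + b1 * x), r)
               for k, x in zip(counts, profile))

def best_shift(forward, reverse, site, half_width, shifts, b0, b1, r):
    """Return (shift, log-likelihood) for the strand shift d with the highest
    joint log-likelihood. Assumes the forward-strand peak sits at site - d and
    the reverse-strand peak at site + d."""
    n = len(forward)
    best = None
    for d in shifts:
        ll = (loglik(forward, triangle_profile(n, site - d, half_width), b0, b1, r)
              + loglik(reverse, triangle_profile(n, site + d, half_width), b0, b1, r))
        if best is None or ll > best[1]:
            best = (d, ll)
    return best

# Toy tag counts (made up for illustration): forward peak at index 8, reverse at 12.
forward = [1, 1, 1, 2, 3, 5, 8, 12, 15, 12, 8, 5, 3, 2, 1, 1, 1, 1, 1, 1, 1]
reverse = [1, 1, 1, 1, 1, 1, 1, 2, 3, 5, 8, 12, 15, 12, 8, 5, 3, 2, 1, 1, 1]
shift, ll = best_shift(forward, reverse, site=10, half_width=6,
                       shifts=range(0, 5), b0=0.0, b1=math.log(15), r=5.0)
print(shift, round(ll, 2))
```

In a fuller treatment the coefficients and dispersion would be estimated jointly with the shift, and the fitted model would be compared against a background-only model (`b1 = 0`) to decide whether the site is a real peak.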
Pages: 826-838
Number of pages: 13