Bayesian analysis of gene essentiality based on sequencing of transposon insertion libraries

Cited by: 55
Authors
DeJesus, Michael A. [1 ]
Zhang, Yanjia J. [2 ]
Sassetti, Christopher M. [3 ]
Rubin, Eric J. [2 ]
Sacchettini, James C. [4 ]
Ioerger, Thomas R. [1 ]
Affiliations
[1] Texas A&M Univ, Dept Comp Sci, College Stn, TX 77843 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Immunol & Infect Dis, Boston, MA 02115 USA
[3] Univ Massachusetts, Sch Med, Dept Microbiol & Physiol Syst, Worcester, MA 01655 USA
[4] Texas A&M Univ, Dept Biochem & Biophys, College Stn, TX 77843 USA
Keywords
GROWTH IN-VITRO; MYCOBACTERIUM-TUBERCULOSIS; MUTAGENESIS; GENOME; BIOSYNTHESIS; SURVIVAL; SYSTEM; VIVO;
DOI
10.1093/bioinformatics/btt043
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: Next-generation sequencing affords an efficient analysis of transposon insertion libraries, which can be used to identify essential genes in bacteria. To analyse this high-resolution data, we present a formal Bayesian framework for estimating the posterior probability of essentiality for each gene, using the extreme-value distribution to characterize the statistical significance of the longest region lacking insertions within a gene. We describe a sampling procedure based on the Metropolis-Hastings algorithm to calculate posterior probabilities of essentiality while simultaneously integrating over unknown internal parameters.

Results: Using a sequence dataset from a transposon library for Mycobacterium tuberculosis, we show that this Bayesian approach predicts essential genes that correspond well with genes shown to be essential in previous studies. Furthermore, we show that by using the extreme-value distribution to characterize genomic regions lacking transposon insertions, this method is capable of identifying essential domains within genes. This approach can be used for analysing transposon libraries in other organisms and augmenting essentiality predictions with statistical confidence scores.

Availability: A Python script implementing the method described is available for download from http://saclab.tamu.edu/essentiality/.

Contact: michael.dejesus@tamu.edu or ioerger@cs.tamu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
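To illustrate the extreme-value idea mentioned in the abstract, the following is a minimal sketch of the standard Gumbel approximation for the longest run of consecutive sites without an insertion. It is not the authors' script: the function name `longest_run_tail_prob` and its parameters are hypothetical, the per-site insertion probability is assumed to be known and constant, and the sketch covers only the tail probability of a run, not the Metropolis-Hastings sampling or the posterior probability of essentiality.

```python
import math


def longest_run_tail_prob(n_sites, p_insert, run_length):
    """Approximate P(longest run of empty sites >= run_length).

    Assumes n_sites independent sites, each carrying an insertion with
    probability p_insert (hypothetical inputs, not taken from the paper).
    Uses the textbook approximation P(R < r) ~ exp(-n * p * q**r),
    written in Gumbel (extreme-value) form.
    """
    q = 1.0 - p_insert                       # probability a site is empty
    scale = 1.0 / math.log(1.0 / q)          # Gumbel scale parameter
    location = math.log(n_sites * p_insert) * scale  # Gumbel location parameter
    cdf = math.exp(-math.exp(-(run_length - location) / scale))
    return 1.0 - cdf


# Example: in a region of 100 sites with 50% insertion density,
# a gap of 20 consecutive empty sites is very unlikely by chance.
tail = longest_run_tail_prob(n_sites=100, p_insert=0.5, run_length=20)
print(f"P(longest gap >= 20) is approximately {tail:.2e}")
```

Under these assumptions, a small tail probability for the longest observed gap in a gene suggests that the gap is unlikely to have arisen by chance, which is the kind of evidence the abstract describes using to flag essential genes or domains.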
Pages: 695-703
Page count: 9