Fully Bayesian analysis of allele-specific RNA-seq data

Cited by: 0
Authors
Alvarez-Castro, Ignacio [1]
Niemi, Jarad [2]
Affiliations
[1] Univ Republica, Inst Estadist, Montevideo, Uruguay
[2] Iowa State Univ, Dept Stat, Ames, IA 50010 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
hierarchical model; shrinkage priors; allele-specific expression; RNA-seq; Markov chain Monte Carlo; GPU; differential expression analysis; complexity; model
DOI
10.3934/mbe.2019389
Chinese Library Classification (CLC)
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Diploid organisms have two copies of each gene, called alleles, that can be separately transcribed. The RNA abundance associated with a particular allele is known as allele-specific expression (ASE). When two alleles have polymorphisms in transcribed regions, ASE can be studied using RNA-seq read count data. ASE differs from regular RNA-seq expression in three ways: it cannot be assessed for every gene; measures of ASE can be biased towards one of the alleles (the reference allele); and ASE provides two measures of expression for a single gene in each biological sample, which leads to additional complications for single-gene models. We present statistical methods for modeling ASE and detecting genes with differential allelic expression. We propose a hierarchical, overdispersed count regression model for ASE counts. The model accommodates gene-specific overdispersion, has an internal measure of the reference allele bias, and uses random effects to model the gene-specific regression parameters. Fully Bayesian inference is obtained using the fbseq package, which implements a parallel strategy to keep computation times reasonable. Simulations and real data analysis suggest the proposed model is a practical and powerful tool for the study of differential ASE.
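To make the kind of data described in the abstract concrete, the following is a minimal simulation sketch of overdispersed allele-specific counts for a single gene. It is not the authors' model or the fbseq API: the Poisson log-normal form, the function names (`poisson_draw`, `simulate_gene`) and the parameters (`log_mean`, `allele_effect`, `ref_bias`, `noise_sd`) are all assumptions chosen for illustration.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method.

    Suitable for modest means only; exp(-mean) underflows for very large means.
    """
    if mean > 700:
        raise ValueError("mean too large for this simple sampler")
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_gene(rng, n_samples, log_mean, allele_effect, ref_bias, noise_sd):
    """Simulate (reference, alternative) read counts for one hypothetical gene.

    Assumed (illustrative) structure on the log scale:
      log rate = log_mean +/- (allele_effect + ref_bias) / 2 + eps,
    where eps ~ Normal(0, noise_sd) supplies gene-specific overdispersion and
    ref_bias shifts reads towards the reference allele.
    """
    counts = []
    for _ in range(n_samples):
        pair = []
        for sign in (+1, -1):  # +1: reference allele, -1: alternative allele
            eps = rng.gauss(0.0, noise_sd)
            log_rate = log_mean + sign * (allele_effect + ref_bias) / 2 + eps
            pair.append(poisson_draw(rng, math.exp(log_rate)))
        counts.append(tuple(pair))
    return counts


if __name__ == "__main__":
    rng = random.Random(2019)
    for ref, alt in simulate_gene(rng, 5, 3.0, 0.5, 0.1, 0.4):
        print(ref, alt)
```

Each biological sample yields two counts for the same gene, which is the feature the abstract identifies as complicating single-gene models. In this sketch the allelic effect and the reference bias enter the mean in the same way, so separating them requires information beyond a single gene.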
Pages: 7751-7770
Page count: 20
Related papers
39 in total
[1]   Differential expression analysis for sequence count data [J].
Anders, Simon ;
Huber, Wolfgang .
GENOME BIOLOGY, 2010, 11 (10)
[2]   A Bivariate Model for Simultaneous Testing in Bioinformatics Data [J].
Bar, Haim Y. ;
Booth, James G. ;
Wells, Martin T. .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (506) :537-547
[3]   RNA-Seq Analysis of Allele-Specific Expression, Hybrid Effects, and Regulatory Divergence in Hybrids Compared with Their Parents from Natural Populations [J].
Bell, Graeme D. M. ;
Kane, Nolan C. ;
Rieseberg, Loren H. ;
Adams, Keith L. .
GENOME BIOLOGY AND EVOLUTION, 2013, 5 (07) :1309-1323
[4]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[5]  
Chen YS, 2014, FRONT PROBAB STAT SC, P51, DOI 10.1007/978-3-319-07212-8_3
[6]  
Datta S., 2014, STAT ANAL NEXT GENER, DOI 10.1007/978-3-319-07212-8
[7]   Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data [J].
Degner, Jacob F. ;
Marioni, John C. ;
Pai, Athma A. ;
Pickrell, Joseph K. ;
Nkadori, Everlyne ;
Gilad, Yoav ;
Pritchard, Jonathan K. .
BIOINFORMATICS, 2009, 25 (24) :3207-3212
[8]  
Gelman A., 2013, Bayesian data analysis, 3rd ed.
[9]   Prior distributions for variance parameters in hierarchical models (Comment on an Article by Browne and Draper) [J].
Gelman, Andrew .
BAYESIAN ANALYSIS, 2006, 1 (03) :515-533
[10]  
Ghosh J. K., 2006, INTRO BAYESIAN ANAL, DOI 10.1002/9781118684818.ch16