Bayesian negative binomial regression for differential expression with confounding factors

Cited by: 12
Authors
Dadaneh, Siamak Zamani [1]
Zhou, Mingyuan [2]
Qian, Xiaoning [1]
Affiliations
[1] Texas A&M Univ, TEES AgriLife Ctr Bioinformat & Genom Syst Engn, Dept Elect & Comp Engn, College Stn, TX 77843 USA
[2] Univ Texas Austin, Dept Informat Risk & Operat Management, Austin, TX 78712 USA
Funding
National Science Foundation (USA);
Keywords
RNA-SEQ; BIOCONDUCTOR; SOFTWARE; PACKAGE; COUNT;
DOI
10.1093/bioinformatics/bty330
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Rapid adoption of high-throughput sequencing technologies has enabled better understanding of genome-wide molecular profile changes associated with phenotypic differences in biomedical studies. Often, these changes are due to multiple interacting factors. Existing methods are mostly considering differential expression across two conditions studying one main factor without considering other confounding factors. In addition, they are often coupled with essential sophisticated ad-hoc pre-processing steps such as normalization, restricting their adaptability to general experimental setups. Complex multi-factor experimental design to accurately decipher genotype-phenotype relationships signifies the need for developing effective statistical tools for genome-scale sequencing data profiled under multi-factor conditions.
Results: We have developed a novel Bayesian negative binomial regression (BNB-R) method for the analysis of RNA sequencing (RNA-seq) count data. In particular, the natural model parameterization removes the needs for the normalization step, while the method is capable of tackling complex experimental design involving multi-variate dependence structures. Efficient Bayesian inference of model parameters is obtained by exploiting conditional conjugacy via novel data augmentation techniques. Comprehensive studies on both synthetic and real-world RNA-seq data demonstrate the superior performance of BNB-R in terms of the areas under both the receiver operating characteristic and precision-recall curves.
Availability and implementation: BNB-R is implemented in R language and is available at https://github.com/siamakz/BNBR.
Pages: 3349-3356
Number of pages: 8