A two-parameter generalized Poisson model to improve the analysis of RNA-seq data

Cited by: 93
Authors
Srivastava, Sudeep [1]
Chen, Liang [1]
Affiliations
[1] Univ So Calif, Dept Biol Sci, Los Angeles, CA 90089 USA
Funding
National Institutes of Health (USA);
Keywords
DIFFERENTIAL EXPRESSION; HUMAN TRANSCRIPTOME; GENOME; NORMALIZATION; ALIGNMENT;
DOI
10.1093/nar/gkq670
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Deep sequencing of RNAs (RNA-seq) has been a useful tool to characterize and quantify transcriptomes. However, there are significant challenges in the analysis of RNA-seq data, such as how to separate signals from sequencing bias and how to perform reasonable normalization. Here, we focus on a fundamental question in RNA-seq analysis: the distribution of the position-level read counts. Specifically, we propose a two-parameter generalized Poisson (GP) model for the position-level read counts. We show that the GP model fits the data much better than the traditional Poisson model. Based on the GP model, we can better estimate gene or exon expression, perform a more reasonable normalization across different samples, and improve the identification of differentially expressed genes and of differentially spliced exons. The usefulness of the GP model is demonstrated by applications to multiple RNA-seq data sets.
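As an illustration only: the abstract does not state the model's formula, so the sketch below assumes the standard Consul-Jain form of the generalized Poisson distribution, P(X = x) = θ(θ + xλ)^(x−1) e^(−θ−xλ) / x!, with mean θ/(1−λ) and variance θ/(1−λ)^3. The function names `gp_pmf` and `gp_moment_fit` are hypothetical, and the moment-based fit is a simple stand-in, not necessarily the estimation procedure used in the paper. Setting λ = 0 recovers the ordinary Poisson model, which is why the second parameter can absorb overdispersion in position-level read counts.

```python
import math


def gp_pmf(x, theta, lam):
    """Generalized Poisson PMF (Consul-Jain form, assumed here).

    Valid for theta > 0 and 0 <= lam < 1; lam = 0 gives the Poisson PMF.
    """
    if x < 0:
        return 0.0
    # Work in log space to avoid overflow for large counts.
    log_p = (
        math.log(theta)
        + (x - 1) * math.log(theta + x * lam)
        - (theta + x * lam)
        - math.lgamma(x + 1)
    )
    return math.exp(log_p)


def gp_moment_fit(counts):
    """Method-of-moments estimate of (theta, lam) from a list of counts.

    Uses mean = theta / (1 - lam) and variance = theta / (1 - lam)**3.
    Falls back to the Poisson case (lam = 0) when the sample is not
    overdispersed. This is an illustrative estimator only.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if mean <= 0 or var <= mean:
        return mean, 0.0
    lam = 1.0 - math.sqrt(mean / var)
    theta = mean * (1.0 - lam)
    return theta, lam


if __name__ == "__main__":
    # Hypothetical per-position read counts for one exon.
    theta, lam = gp_moment_fit([0, 1, 2, 3, 10])
    print(theta, lam)
    print(gp_pmf(0, theta, lam))
```

A positive fitted λ indicates that the counts vary more than a Poisson model with the same mean would allow.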
Pages: e170
Number of pages: 15