Accurate Estimation of Expression Levels of Homologous Genes in RNA-seq Experiments

Cited by: 18
Authors
Pasaniuc, Bogdan [1, 2]
Zaitlen, Noah [1, 2]
Halperin, Eran [3, 4, 5]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
[4] Tel Aviv Univ, Mol Microbiol & Biotechnol Dept, IL-69978 Tel Aviv, Israel
[5] Tel Aviv Univ, Blavatnik Sch Comp Sci, IL-69978 Tel Aviv, Israel
Funding
US National Science Foundation; Israel Science Foundation
Keywords
algorithms; gene searching; genetic mapping; genetic variation; TRANSCRIPTOMES; REVEALS; GENOME; MOUSE;
DOI
10.1089/cmb.2010.0259
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation high-throughput sequencing (NGS) is poised to replace array-based technologies as the experiment of choice for measuring RNA expression levels. Several groups have demonstrated the power of this new approach (RNA-seq), making significant and novel contributions and simultaneously proposing methodologies for the analysis of RNA-seq data. In a typical experiment, millions of short sequences (reads) are sampled from RNA extracts and mapped back to a reference genome. The number of reads mapping to each gene is used as a proxy for its corresponding RNA concentration. A significant challenge in analyzing RNA expression of homologous genes is the large fraction of reads that map to multiple locations in the reference genome. Currently, these reads are either dropped from the analysis, or a naive algorithm is used to estimate their underlying distribution. In this work, we present a rigorous alternative for handling the reads generated in an RNA-seq experiment within a probabilistic model for RNA-seq data; we develop maximum likelihood-based methods for estimating the model parameters. In contrast to previous methods, our model takes into account the fact that the DNA of the sequenced individual is not a perfect copy of the reference sequence. We show with both simulated and real RNA-seq data that our new method improves the accuracy and power of RNA-seq experiments.
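To make the idea of maximum-likelihood handling of multi-mapped reads concrete, the following is a minimal sketch of a generic expectation-maximization (EM) scheme for a simple read-mixture model. It is not the authors' model or code: the function name `estimate_gene_fractions`, the input layout (one dictionary per read, mapping a gene index to an assumed alignment likelihood), and the toy numbers are all illustrative assumptions. In particular, the per-alignment likelihoods are taken as given here, whereas the abstract indicates that the paper's model accounts for differences between the sequenced individual and the reference.

```python
from typing import Dict, List


def estimate_gene_fractions(
    read_likelihoods: List[Dict[int, float]],
    n_genes: int,
    n_iter: int = 500,
    tol: float = 1e-10,
) -> List[float]:
    """Generic EM for a read-mixture model (illustrative, not the paper's method).

    read_likelihoods[r] maps gene index -> assumed P(read r | gene).
    Returns theta, where theta[g] is the estimated fraction of reads from gene g.
    """
    theta = [1.0 / n_genes] * n_genes  # start from a uniform guess
    for _ in range(n_iter):
        # E-step: split each read fractionally among its candidate genes.
        counts = [0.0] * n_genes
        for lik in read_likelihoods:
            total = sum(theta[g] * p for g, p in lik.items())
            if total == 0.0:
                continue  # read is incompatible with every gene under current theta
            for g, p in lik.items():
                counts[g] += theta[g] * p / total
        assigned = sum(counts)
        if assigned == 0.0:
            return theta  # nothing could be assigned; keep the current estimate
        # M-step: re-estimate fractions from the expected counts.
        new_theta = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break
    return theta


# Toy data (hypothetical): two homologous genes, 0 and 1.
toy_reads = (
    [{0: 1.0}] * 30            # reads mapping uniquely to gene 0
    + [{1: 1.0}] * 10          # reads mapping uniquely to gene 1
    + [{0: 1.0, 1: 1.0}] * 60  # multi-reads, equally likely under both genes
)
toy_theta = estimate_gene_fractions(toy_reads, n_genes=2)
```

On this toy input the estimate converges to roughly 0.75 for gene 0 and 0.25 for gene 1: the ambiguous reads are shared in proportion to the evidence from uniquely mapped reads, rather than being dropped or split evenly.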
Pages: 459-468
Page count: 10