Accurate Estimation of Expression Levels of Homologous Genes in RNA-seq Experiments

Cited by: 18
Authors
Pasaniuc, Bogdan [1, 2]
Zaitlen, Noah [1, 2]
Halperin, Eran [3, 4, 5]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
[4] Tel Aviv Univ, Mol Microbiol & Biotechnol Dept, IL-69978 Tel Aviv, Israel
[5] Tel Aviv Univ, Blavatnik Sch Comp Sci, IL-69978 Tel Aviv, Israel
Funding
National Science Foundation (USA); Israel Science Foundation
Keywords
algorithms; gene searching; genetic mapping; genetic variation; TRANSCRIPTOMES; REVEALS; GENOME; MOUSE;
DOI
10.1089/cmb.2010.0259
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Next-generation high-throughput sequencing (NGS) is poised to replace array-based technologies as the experiment of choice for measuring RNA expression levels. Several groups have demonstrated the power of this new approach (RNA-seq), making significant and novel contributions and simultaneously proposing methodologies for the analysis of RNA-seq data. In a typical experiment, millions of short sequences (reads) are sampled from RNA extracts and mapped back to a reference genome. The number of reads mapping to each gene is used as a proxy for its corresponding RNA concentration. A significant challenge in analyzing RNA expression of homologous genes is the large fraction of the reads that map to multiple locations in the reference genome. Currently, these reads are either dropped from the analysis, or a naive algorithm is used to estimate their underlying distribution. In this work, we present a rigorous alternative for handling the reads generated in an RNA-seq experiment within a probabilistic model for RNA-seq data; we develop maximum likelihood-based methods for estimating the model parameters. In contrast to previous methods, our model takes into account the fact that the DNA of the sequenced individual is not a perfect copy of the reference sequence. We show with both simulated and real RNA-seq data that our new method improves the accuracy and power of RNA-seq experiments.
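To make the multi-mapping problem in the abstract concrete, the following is a minimal sketch of a generic expectation-maximization (EM) scheme that assigns multi-mapped reads fractionally instead of dropping them. It is not the authors' model: in particular it ignores sequencing errors and the genetic variation between the sequenced individual and the reference that the abstract emphasizes. The function name `em_multiread_fractions`, its parameters, and the assumption that a read is equally likely to start anywhere along a gene are all illustrative assumptions.

```python
def em_multiread_fractions(read_hits, gene_lengths, n_iter=200):
    """Estimate the fraction of reads originating from each gene.

    read_hits    : list of lists; read_hits[i] names the genes read i maps to
                   (hypothetical input format, one entry per read).
    gene_lengths : dict mapping gene name -> length in bases.

    Assumed model (a simplification, not the paper's): a read comes from
    gene g with probability theta[g] and, given g, starts uniformly at
    random along the gene, so P(read | g) = 1 / gene_lengths[g].
    """
    genes = list(gene_lengths)
    theta = {g: 1.0 / len(genes) for g in genes}  # uniform starting guess

    for _ in range(n_iter):
        expected = {g: 0.0 for g in genes}

        # E-step: split each read among the genes it maps to, in proportion
        # to the current posterior probability of each candidate origin.
        for hits in read_hits:
            weights = {g: theta[g] / gene_lengths[g] for g in hits}
            total = sum(weights.values())
            if total == 0.0:
                continue  # no candidate has support; leave the read unassigned
            for g, w in weights.items():
                expected[g] += w / total

        # M-step: the maximum likelihood update is the normalized
        # expected read count per gene.
        assigned = sum(expected.values())
        if assigned == 0.0:
            break
        theta = {g: expected[g] / assigned for g in genes}

    return theta


if __name__ == "__main__":
    # Toy data: two homologous genes of equal length, 30 reads unique to A,
    # 10 unique to B, and 20 reads that map equally well to both.
    reads = [["A"]] * 30 + [["B"]] * 10 + [["A", "B"]] * 20
    print(em_multiread_fractions(reads, {"A": 1000, "B": 1000}))
```

In the toy data, discarding the 20 ambiguous reads would throw away a third of the sample, while splitting them evenly would ignore the evidence from the uniquely mapped reads that gene A is more highly expressed. The EM updates instead let the unique reads inform how the ambiguous ones are shared.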
Pages: 459-468
Page count: 10
Related papers
50 items in total
  • [31] Differential expression analysis of RNA-seq data at single-base resolution
    Frazee, Alyssa C.
    Sabunciyan, Sarven
    Hansen, Kasper D.
    Irizarry, Rafael A.
    Leek, Jeffrey T.
    BIOSTATISTICS, 2014, 15 (03) : 413 - 426
  • [32] A Bioinformatics Pipeline for Investigating Molecular Evolution and Gene Expression using RNA-seq
    Macias-Munoz, Aide
    Mortazavi, Ali
    JOVE-JOURNAL OF VISUALIZED EXPERIMENTS, 2021, (171):
  • [33] Effect of Low-Expression Gene Filtering on Detection of Differentially Expressed Genes in RNA-Seq Data
    Sha, Ying
    Phan, John H.
    Wang, May D.
    2015 37TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2015, : 6461 - 6464
  • [34] Resolving candidate genes of mouse skeletal muscle QTL via RNA-Seq and expression network analyses
    Lionikas, Arimantas
    Meharg, Caroline
    Derry, Jonathan M. J.
    Ratkevicius, Aivaras
    Carroll, Andrew M.
    Vandenbergh, David J.
    Blizard, David A.
    BMC GENOMICS, 2012, 13
  • [35] Allele-specific RNA-seq expression profiling of imprinted genes in mouse isogenic pluripotent states
    Dirks, Rene A. M.
    van Mierlo, Guido
    Kerstens, Hindrik H. D.
    Bernardo, Andreia S.
    Kobolak, Julianna
    Bock, Istvan
    Maruotti, Julien
    Pedersen, Roger A.
    Dinnyes, Andras
    Huynen, Martijn A.
    Jouneau, Alice
    Marks, Hendrik
    EPIGENETICS & CHROMATIN, 2019, 12 (1)
  • [36] Improving RNA-Seq expression estimates by correcting for fragment bias
    Roberts, Adam
    Trapnell, Cole
    Donaghey, Julie
    Rinn, John L.
    Pachter, Lior
    GENOME BIOLOGY, 2011, 12 (03):
  • [37] An mRNA expression atlas for the duck with public RNA-seq datasets
    Tao, Qiuyu
    Huang, Anqi
    Qi, Jingjing
    Yang, Zhao
    Guo, Shihao
    Lu, Yinjuan
    He, Xinxin
    Han, Xu
    Jiang, Shuaixue
    Xu, Mengru
    Bai, Yuan
    Zhang, Tao
    Hu, Shenqiang
    Li, Liang
    Bai, Lili
    Liu, Hehe
    BMC GENOMICS, 2025, 26 (01):
  • [38] Investigation of Factors Affecting RNA-Seq Gene Expression Calls
    Harati, Sahar
    Phan, John H.
    Wang, May D.
    2014 36TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2014, : 5232 - 5235
  • [39] Identification of nuclear genes controlling chlorophyll synthesis in barley by RNA-seq
    Shmakov, Nickolay A.
    Vasiliev, Gennadiy V.
    Shatskaya, Natalya V.
    Doroshkov, Alexey V.
    Gordeeva, Elena I.
    Afonnikov, Dmitry A.
    Khlestkina, Elena K.
    BMC PLANT BIOLOGY, 2016, 16
  • [40] The transcriptome of Leishmania major in the axenic promastigote stage: transcript annotation and relative expression levels by RNA-seq
    Rastrojo, Alberto
    Carrasco-Ramiro, Fernando
    Martin, Diana
    Crespillo, Antonio
    Reguera, Rosa M.
    Aguado, Begona
    Requena, Jose M.
    BMC GENOMICS, 2013, 14