TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads

Cited by: 28
Authors
Nariai, Naoki [1]
Kojima, Kaname [1]
Mimori, Takahiro [1]
Sato, Yukuto [1]
Kawai, Yosuke [1]
Yamaguchi-Kabata, Yumi [1]
Nagasaki, Masao [1]
Affiliations
[1] Tohoku Univ, Tohoku Med Megabank Org, Dept Integrat Genom, Aoba Ku, Sendai, Miyagi 9808573, Japan
Source
BMC GENOMICS | 2014 / Vol. 15
Keywords
REFERENCE GENOME; ALIGNMENT; GENE; QUANTIFICATION; REVEALS;
DOI
10.1186/1471-2164-15-S10-S5
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for next-generation sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g., > 250 bp). Results: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed- and variable-length RNA-Seq data. Our method models the substitution, deletion, and insertion errors of sequencers based on gapped alignments of reads to the reference cDNA sequences, so that sensitive read aligners such as Bowtie2 and BWA-MEM are effectively incorporated into our pipeline. In addition, a heuristic algorithm is implemented in the variational Bayesian inference for faster computation. We apply TIGAR2 to both simulated data and real data from human samples, and evaluate the performance of transcript quantification with TIGAR2 in comparison to existing methods. Conclusions: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp, both single-end and paired-end) and for variable-length reads, especially for reads longer than 250 bp.
Pages: 9
Related papers
50 in total
  • [21] Polyester: simulating RNA-seq datasets with differential transcript expression
    Frazee, Alyssa C.
    Jaffe, Andrew E.
    Langmead, Ben
    Leek, Jeffrey T.
    BIOINFORMATICS, 2015, 31 (17) : 2778 - 2784
  • [22] Trimming of sequence reads alters RNA-Seq gene expression estimates
    Williams, Claire R.
    Baccarella, Alyssa
    Parrish, Jay Z.
    Kim, Charles C.
    BMC BIOINFORMATICS, 2016, 17
  • [23] Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks
    Trapnell, Cole
    Roberts, Adam
    Goff, Loyal
    Pertea, Geo
    Kim, Daehwan
    Kelley, David R.
    Pimentel, Harold
    Salzberg, Steven L.
    Rinn, John L.
    Pachter, Lior
    NATURE PROTOCOLS, 2012, 7 (03) : 562 - 578
  • [24] SSP: An interval integer linear programming for de novo transcriptome assembly and isoform discovery of RNA-seq reads
    Safikhani, Zhaleh
    Sadeghi, Mehdi
    Pezeshk, Hamid
    Eslahchi, Changiz
    GENOMICS, 2013, 102 (5-6) : 507 - 514
  • [25] Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown
    Pertea, Mihaela
    Kim, Daehwan
    Pertea, Geo M.
    Leek, Jeffrey T.
    Salzberg, Steven L.
    NATURE PROTOCOLS, 2016, 11 (09) : 1650 - 1667
  • [26] BaRTv2: a highly resolved barley reference transcriptome for accurate transcript-specific RNA-seq quantification
    Coulter, Max
    Entizne, Juan Carlos
    Guo, Wenbin
    Bayer, Micha
    Wonneberger, Ronja
    Milne, Linda
    Schreiber, Miriam
    Haaning, Allison
    Muehlbauer, Gary J.
    McCallum, Nicola
    Fuller, John
    Simpson, Craig
    Stein, Nils
    Brown, John W. S.
    Waugh, Robbie
    Zhang, Runxuan
    PLANT JOURNAL, 2022, 111 (04) : 1183 - 1202
  • [27] Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling
    Labaj, Pawel P.
    Leparc, German G.
    Linggi, Bryan E.
    Markillie, Lye Meng
    Wiley, H. Steven
    Kreil, David P.
    BIOINFORMATICS, 2011, 27 (13) : I383 - I391
  • [28] Comparative evaluation of isoform-level gene expression estimation algorithms for RNA-seq and exon-array platforms
    Dapas, Matthew
    Kandpal, Manoj
    Bi, Yingtao
    Davuluri, Ramana V.
    BRIEFINGS IN BIOINFORMATICS, 2017, 18 (02) : 260 - 269
  • [29] Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts
    Ntranos, Vasilis
    Kamath, Govinda M.
    Zhang, Jesse M.
    Pachter, Lior
    Tse, David N.
    GENOME BIOLOGY, 2016, 17
  • [30] RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome
    Li, Bo
    Dewey, Colin N.
    BMC BIOINFORMATICS, 2011, 12