TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads

Cited by: 28

Authors:

Nariai, Naoki [1]

Kojima, Kaname [1]

Mimori, Takahiro [1]

Sato, Yukuto [1]

Kawai, Yosuke [1]

Yamaguchi-Kabata, Yumi [1]

Nagasaki, Masao [1]

Affiliations:

[1] Tohoku Univ, Tohoku Med Megabank Org, Dept Integrat Genom, Aoba Ku, Sendai, Miyagi 9808573, Japan

Source:

BMC GENOMICS | 2014 / Vol. 15

Keywords:

REFERENCE GENOME; ALIGNMENT; GENE; QUANTIFICATION; REVEALS;

DOI:

10.1186/1471-2164-15-S10-S5

Chinese Library Classification (CLC) number:

Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];

Subject classification codes:

071005 ; 0836 ; 090102 ; 100705 ;

Abstract:

Background: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for next-generation sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g. > 250 bp).

Results: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed-length and variable-length RNA-Seq data. Our method models the substitution, deletion, and insertion errors of sequencers based on gapped alignments of reads to the reference cDNA sequences, so that sensitive read aligners such as Bowtie2 and BWA-MEM can be effectively incorporated into our pipeline. In addition, a heuristic algorithm is implemented in the variational Bayesian inference for faster computation. We apply TIGAR2 to both simulated data and real data from human samples, and evaluate its transcript quantification performance in comparison with existing methods.

Conclusions: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp, both single-end and paired-end) and for variable-length reads, especially for reads longer than 250 bp.


Pages: 9

50 items in total

[31] Hierarchical analysis of RNA-seq reads improves the accuracy of allele-specific expression
Raghupathy, Narayanan
Choi, Kwangbom
Vincent, Matthew J.
Beane, Glen L.
Sheppard, Keith S.
Munger, Steven C.
Korstanje, Ron
Pardo-Manuel de Villena, Fernando
Churchill, Gary A.
BIOINFORMATICS, 2018, 34 (13) : 2177 - 2184
[32] Immunoglobulin transcript sequence and somatic hypermutation computation from unselected RNA-seq reads in chronic lymphocytic leukemia
Blachly, James S.
Ruppert, Amy S.
Zhao, Weiqiang
Long, Susan
Flynn, Joseph
Flinn, Ian
Jones, Jeffrey
Maddocks, Kami
Andritsos, Leslie
Ghia, Emanuela M.
Rassenti, Laura Z.
Kipps, Thomas J.
de la Chapelle, Albert
Byrd, John C.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2015, 112 (14) : 4322 - 4327
[33] Evaluation of Normalization Methods for RNA-Seq Gene Expression Estimation
Wu, Po-Yen
Phan, John H.
Zhou, Fengfeng
Wang, May D.
2011 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, 2011, : 50 - 57
[34] Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation
Trapnell, Cole
Williams, Brian A.
Pertea, Geo
Mortazavi, Ali
Kwan, Gordon
van Baren, Marijke J.
Salzberg, Steven L.
Wold, Barbara J.
Pachter, Lior
NATURE BIOTECHNOLOGY, 2010, 28 (05) : 511 - U174
[35] A novel min-cost flow method for estimating transcript expression with RNA-Seq
Tomescu, Alexandru I.
Kuosmanen, Anna
Rizzi, Romeo
Makinen, Veli
BMC BIOINFORMATICS, 2013, 14
[36] Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation
Love, Michael I.
Hogenesch, John B.
Irizarry, Rafael A.
NATURE BIOTECHNOLOGY, 2016, 34 (12) : 1287 - 1291
[37] MGMR: leveraging RNA-Seq population data to optimize expression estimation
Rozov, Roye
Halperin, Eran
Shamir, Ron
BMC BIOINFORMATICS, 2012, 13
[38] BaRTv1.0: an improved barley reference transcript dataset to determine accurate changes in the barley transcriptome using RNA-seq
Rapazote-Flores, Paulo
Bayer, Micha
Milne, Linda
Mayer, Claus-Dieter
Fuller, John
Guo, Wenbin
Hedley, Pete E.
Morris, Jenny
Halpin, Claire
Kam, Jason
Mckim, Sarah M.
Zwirek, Monika
Casao, M. Cristina
Barakate, Abdellah
Schreiber, Miriam
Stephen, Gordon
Zhang, Runxuan
Brown, John W. S.
Waugh, Robbie
Simpson, Craig G.
BMC GENOMICS, 2019, 20 (01)
[39] Sensitive, reliable and robust circRNA detection from RNA-seq with CirComPara2
Gaffo, Enrico
Buratin, Alessia
Dal Molin, Anna
Bortoluzzi, Stefania
BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
[40] Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation
Li, Jingyi Jessica
Jiang, Ci-Ren
Brown, James B.
Huang, Haiyan
Bickel, Peter J.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (50) : 19867 - 19872
