CADBURE: A generic tool to evaluate the performance of spliced aligners on RNA-Seq data

Cited by: 7
Authors
Kumar, Praveen Kumar Raj [1]
Hoang, Thanh V. [1]
Robinson, Michael L. [1]
Tsonis, Panagiotis A. [2,3]
Liang, Chun [1,4]
Affiliations
[1] Miami Univ, Dept Biol, Oxford, OH 45056 USA
[2] Univ Dayton, Dept Biol, Dayton, OH 45469 USA
[3] Univ Dayton, Ctr Tissue Regenerat & Engn, Dayton, OH 45469 USA
[4] Miami Univ, Dept Comp Sci & Software Engn, Oxford, OH 45056 USA
Source
SCIENTIFIC REPORTS | 2015 / Volume 5
Keywords
TRANSCRIPTOME ANALYSIS; EXPRESSION; ALIGNMENT; READS
DOI
10.1038/srep13443
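Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract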
The fundamental task in RNA-Seq-based transcriptome analysis is alignment of millions of short reads to the reference genome or transcriptome. Choosing the right tool for the dataset at hand from the many existing RNA-Seq alignment packages remains a critical challenge for downstream analysis. To facilitate this choice, we designed a novel tool for comparing alignment results of user data based on the relative reliability of uniquely aligned reads (CADBURE). CADBURE can easily evaluate different aligners, or different parameter sets using the same aligner, and selects the best alignment result for any RNA-Seq dataset. Strengths of CADBURE include the ability to compare alignment results without the need for synthetic data such as simulated genomes, alignment regeneration and randomly subsampled datasets. The benefit of a CADBURE-selected alignment result was supported by differentially expressed gene (DEG) analysis. We demonstrated that the use of CADBURE to select the best alignment from a number of different alignment results could change the number of DEGs by as much as 10%. In particular, the CADBURE-selected alignment result favors fewer false positives in the DEG analysis. We also verified differential expression of eighteen genes with RT-qPCR validation experiments. CADBURE is an open source tool (http://cadbure.sourceforge.net/).
Pages: 10