Computational Considerations in Transcriptome Assemblies and Their Evaluation, using High Quality Human RNA-Seq data

Cited by: 0
Authors
Ghaffari, Noushin [1]
Abante, Jordi [1]
Singh, Raminder [2]
Blood, Philip D. [3]
Johnson, Charles D. [1]
Affiliations
[1] Texas A&M AgriLife Res, CBGSE, AgriLife Genom & Bioinformat, 101 Gateway Blv, College Stn, TX 77845 USA
[2] Indiana Univ Res Technol, 2709 East 10th St, Bloomington, IN 47408 USA
[3] Pittsburgh Supercomp Ctr, 300 S Craig St, Pittsburgh, PA 15213 USA
Source
PROCEEDINGS OF XSEDE16: DIVERSITY, BIG DATA, AND SCIENCE AT SCALE | 2016
Funding
U.S. National Science Foundation;
Keywords
RNA-Seq; Next-generation sequencing; Transcriptome assembly; Quality control; High performance computing; Data management challenges; ALIGNMENT;
DOI
10.1145/2949550.2949572
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
It is crucial to understand the performance of transcriptome assemblies to improve current practices. Investigating the factors that affect a transcriptome assembly is the primary goal of our project. To that end, we designed a multi-step pipeline consisting of a variety of pre-processing and quality control steps. XSEDE allocations enabled us to meet the computational demands of the project. The high-memory Blacklight and Greenfield systems at the Pittsburgh Supercomputing Center were essential to accomplishing multiple steps of this project. This paper presents the computational aspects of our comprehensive transcriptome assembly and validation study.
Pages: 4