Unipro UGENE NGS pipelines and components for variant calling, RNA-seq and ChIP-seq data analyses

Cited by: 67
Authors
Golosova, Olga [1]
Henderson, Ross [2]
Vaskin, Yuriy [1]
Gabrielian, Andrei [2]
Grekhov, German [1]
Nagarajan, Vijayaraj [2]
Oler, Andrew J. [2]
Quiñones, Mariam [2]
Hurt, Darrell [2]
Fursov, Mikhail [1]
Huyen, Yentram [2]
Affiliations
[1] Unipro Ctr Informat Technol, Novosibirsk, Russia
[2] NIAID, Bioinformat & Computat Biosci Branch, Off Cyber Infrastruct & Computat Biol, NIH, Bethesda, MD 20892 USA
Source
PeerJ | 2014 / Volume 2
Keywords
Bioinformatics; Next-generation sequencing; Data analysis; ChIP-seq; Variant calling; RNA-seq; Read alignment; Workflows; TopHat; Gene
DOI
10.7717/peerj.644
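Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]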
Discipline classification codes
07; 0710; 09
Abstract
The advent of Next Generation Sequencing (NGS) technologies has opened new possibilities for researchers. However, the more biology becomes a data-intensive field, the more biologists have to learn how to process and analyze NGS data with complex computational tools. Even with the availability of common pipeline specifications, it is often a time-consuming and cumbersome task for a bench scientist to install and configure the pipeline tools. We believe that a unified, desktop and biologist-friendly front end to NGS data analysis tools will substantially improve productivity in this field. Here we present NGS pipelines "Variant Calling with SAMtools", "Tuxedo Pipeline for RNA-seq Data Analysis" and "Cistrome Pipeline for ChIP-seq Data Analysis" integrated into the Unipro UGENE desktop toolkit. We describe the available UGENE infrastructure that helps researchers run these pipelines on different datasets, store and investigate the results and re-run the pipelines with the same parameters. These pipeline tools are included in the UGENE NGS package. Individual blocks of these pipelines are also available for expert users to create their own advanced workflows.
Pages: 15