Omics Pipe: a community-based framework for reproducible multi-omics data analysis

Cited by: 43
Authors
Fisch, Kathleen M. [1]
Meissner, Tobias [1]
Gioia, Louis [1]
Ducom, Jean-Christophe [2]
Carland, Tristan M. [3]
Loguercio, Salvatore [1]
Su, Andrew I. [1]
Affiliations
[1] Scripps Res Inst, Dept Mol & Expt Med, La Jolla, CA 92037 USA
[2] Scripps Res Inst, La Jolla, CA 92037 USA
[3] J Craig Venter Inst, Dept Human Biol, La Jolla, CA 92037 USA
Keywords
DIFFERENTIAL EXPRESSION ANALYSIS; BIOCONDUCTOR PACKAGE; GENE; TOOL;
DOI
10.1093/bioinformatics/btv061
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Omics Pipe (http://sulab.scripps.edu/omicspipe) is a computational framework that automates multi-omics data analysis pipelines on high performance compute clusters and in the cloud. It supports best practice published pipelines for RNA-seq, miRNA-seq, Exome-seq, Whole-Genome sequencing, ChIP-seq analyses and automatic processing of data from The Cancer Genome Atlas (TCGA). Omics Pipe provides researchers with a tool for reproducible, open source and extensible next generation sequencing analysis. The goal of Omics Pipe is to democratize next-generation sequencing analysis by dramatically increasing the accessibility and reproducibility of best practice computational pipelines, which will enable researchers to generate biologically meaningful and interpretable results.
Results: Using Omics Pipe, we analyzed 100 TCGA breast invasive carcinoma paired tumor-normal datasets based on the latest UCSC hg19 RefSeq annotation. Omics Pipe automatically downloaded and processed the desired TCGA samples on a high throughput compute cluster to produce a results report for each sample. We aggregated the individual sample results and compared them to the analysis in the original publications. This comparison revealed high overlap between the analyses, as well as novel findings due to the use of updated annotations and methods.
Pages: 1724-1728
Page count: 5
Related papers
28 in total
[1]   HTSeq-a Python framework to work with high-throughput sequencing data [J].
Anders, Simon ;
Pyl, Paul Theodor ;
Huber, Wolfgang .
BIOINFORMATICS, 2015, 31 (02) :166-169
[2]   Count-based differential expression analysis of RNA sequencing data using R and Bioconductor [J].
Anders, Simon ;
McCarthy, Davis J. ;
Chen, Yunshun ;
Okoniewski, Michal ;
Smyth, Gordon K. ;
Huber, Wolfgang ;
Robinson, Mark D. .
NATURE PROTOCOLS, 2013, 8 (09) :1765-1786
[3]   Dysregulation of the basal RNA polymerase transcription apparatus in cancer [J].
Bywater, Megan J. ;
Pearson, Richard B. ;
McArthur, Grant A. ;
Hannan, Ross D. .
NATURE REVIEWS CANCER, 2013, 13 (05) :299-314
[4]   Automated Capture of Experiment Context for Easier Reproducibility in Computational Research [J].
Davison, Andrew P. .
COMPUTING IN SCIENCE & ENGINEERING, 2012, 14 (04) :48-56
[5]   STAR: ultrafast universal RNA-seq aligner [J].
Dobin, Alexander ;
Davis, Carrie A. ;
Schlesinger, Felix ;
Drenkow, Jorg ;
Zaleski, Chris ;
Jha, Sonali ;
Batut, Philippe ;
Chaisson, Mark ;
Gingeras, Thomas R. .
BIOINFORMATICS, 2013, 29 (01) :15-21
[6]   An integrated encyclopedia of DNA elements in the human genome [J].
Dunham, Ian ;
Kundaje, Anshul ;
Aldred, Shelley F. ;
Collins, Patrick J. ;
Davis, Carrie A. ;
Doyle, Francis ;
Epstein, Charles B. ;
Frietze, Seth ;
Harrow, Jennifer ;
Kaul, Rajinder ;
Khatun, Jainab ;
Lajoie, Bryan R. ;
Landt, Stephen G. ;
Lee, Bum-Kyu ;
Pauli, Florencia ;
Rosenbloom, Kate R. ;
Sabo, Peter ;
Safi, Alexias ;
Sanyal, Amartya ;
Shoresh, Noam ;
Simon, Jeremy M. ;
Song, Lingyun ;
Trinklein, Nathan D. ;
Altshuler, Robert C. ;
Birney, Ewan ;
Brown, James B. ;
Cheng, Chao ;
Djebali, Sarah ;
Dong, Xianjun ;
Dunham, Ian ;
Ernst, Jason ;
Furey, Terrence S. ;
Gerstein, Mark ;
Giardine, Belinda ;
Greven, Melissa ;
Hardison, Ross C. ;
Harris, Robert S. ;
Herrero, Javier ;
Hoffman, Michael M. ;
Iyer, Sowmya ;
Kellis, Manolis ;
Khatun, Jainab ;
Kheradpour, Pouya ;
Kundaje, Anshul ;
Lassmann, Timo ;
Li, Qunhua ;
Lin, Xinying ;
Marinov, Georgi K. ;
Merkel, Angelika ;
Mortazavi, Ali .
NATURE, 2012, 489 (7414) :57-74
[7]   Identifying ChIP-seq enrichment using MACS [J].
Feng, Jianxing ;
Liu, Tao ;
Qin, Bo ;
Zhang, Yong ;
Liu, Xiaole Shirley .
NATURE PROTOCOLS, 2012, 7 (09) :1728-1740
[8]   Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences [J].
Goecks, Jeremy ;
Nekrutenko, Anton ;
Taylor, James .
GENOME BIOLOGY, 2010, 11 (08)
[9]   Unipro UGENE NGS pipelines and components for variant calling, RNA-seq and ChIP-seq data analyses [J].
Golosova, Olga ;
Henderson, Ross ;
Vaskin, Yuriy ;
Gabrielian, Andrei ;
Grekhov, German ;
Nagarajan, Vijayaraj ;
Oler, Andrew J. ;
Nones, Mariam Qui ;
Hurt, Darrell ;
Fursov, Mikhail ;
Huyen, Yentram .
PEERJ, 2014, 2
[10]   Ruffus: a lightweight Python library for computational pipelines [J].
Goodstadt, Leo .
BIOINFORMATICS, 2010, 26 (21) :2778-2779