Butler enables rapid cloud-based analysis of thousands of human genomes

Cited by: 0
Authors
Sergei Yakneen
Sebastian M. Waszak
Michael Gertz
Jan O. Korbel
Affiliations
[1] European Molecular Biology Laboratory (EMBL),Institute of Computer Science
[2] Genome Biology Unit,Department of Medical Oncology
[3] Heidelberg University,Department of Biomolecular Engineering
[4] EMBL,UC Santa Cruz Genomics Institute
[5] European Bioinformatics Institute (EMBL-EBI),Biomedical Engineering
[6] Sophia Genetics SA,Institute of Pharmacy and Molecular Biotechnology and BioQuant
[7] Genome Informatics Program,Department of Haematology
[8] Ontario Institute for Cancer Research,Department of Biomedical Data Science
[9] Barcelona Supercomputing Center (BSC),Department of Genetics
[10] Laboratory for Medical Science Mathematics,Department of Biochemistry and Molecular Medicine
[11] RIKEN Center for Integrative Medical Sciences,CIBIO/InBIO—Research Center in Biodiversity and Genetic Resources
[12] RIKEN Center for Integrative Medical Sciences,Department Biochemistry and Molecular Biomedicine
[13] Broad Institute of MIT and Harvard,Department of Pathology
[14] Dana-Farber Cancer Institute,Department of Medicine, Section of Hematology/Oncology
[15] University of California Santa Cruz,Department of Pediatric Immunology, Hematology and Oncology
[16] University of California Santa Cruz,Institute of Medical Science
[17] Oregon Health and Science University,Department of Biochemistry
[18] Division of Theoretical Bioinformatics,Health Sciences Department of Biomedical Informatics
[19] German Cancer Research Center (DKFZ),Department of Health Sciences and Technology
[20] Heidelberg Center for Personalized Oncology (DKFZ-HIPO),Center for Biomolecular Science and Engineering
[21] German Cancer Research Center,Department of Cell and Systems Biology
[22] Heidelberg University,Department of Radiation Oncology
[23] Wellcome Sanger Institute,Institute for Genomics and Systems Biology
[24] Wellcome Genome Campus,Department of Molecular Genetics
[25] University of Cambridge,Computational Biology Program
[26] University of California San Diego,Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences
[27] PDXen Biosystems Inc,Finsen Laboratory and Biotech Research & Innovation Centre (BRIC)
[28] Electronics and Telecommunications Research Institute,Department of Urology
[29] Seven Bridges Genomics,Department of Biological Oceanography
[30] Annai Systems,Genome Science Division, Research Center for Advanced Science and Technology
[31] Inc,Department of Surgery
[32] Stanford University School of Medicine,Department of Surgery, Division of Hepatobiliary and Pancreatic Surgery, School of Medicine
[33] Stanford University School of Medicine,Department of Oncology, Gil Medical Center
[34] University of Leuven,Department of Bioinformatics and Computational Biology
[35] The Francis Crick Institute,Bioinformatics Core Facility
[36] Computational Biology Program,Heinrich Pette Institute
[37] Ontario Institute for Cancer Research,Ontario Tumour Bank
[38] The Hospital for Sick Children,Department of Pathology
[39] Heidelberg University,Laboratory of Pathology, Center for Cancer Research
[40] New BIH Digital Health Center,Department of Cellular and Molecular Medicine and Department of Bioengineering
[41] Berlin Institute of Health (BIH) and Charité – Universitätsmedizin Berlin,Sir Peter MacCallum Department of Oncology, Peter MacCallum Cancer Centre
[42] Rigshospitalet,Centre for Research in Molecular Medicine and Chronic Diseases (CiMUS)
[43] University of Montreal,Department of Zoology, Genetics and Physical Anthropology
[44] Universidade do Porto,The Biomedical Research Centre (CINBIO)
[45] University of Barcelona,Department of Genomic Medicine
[46] Center for Cancer Research,Quantitative and Computational Biosciences Graduate Program
[47] Massachusetts General Hospital,Genome Informatics Program
[48] Massachusetts General Hospital,Institute of Human Genetics
[49] Harvard Medical School,Institute of Human Genetics
[50] University of Chicago,Queensland Centre for Medical Genomics, Institute for Molecular Bioscience
Source
Nature Biotechnology | 2020 / Vol. 38
DOI
Not available
Abstract
We present Butler, a computational tool that facilitates large-scale genomic analyses on public and academic clouds. Butler includes innovative anomaly detection and self-healing functions that improve the efficiency of data processing and analysis by 43% compared with current approaches. Butler enabled processing of a 725-terabyte cancer genome dataset from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project in a time-efficient and uniform manner.
Pages: 288-292
Number of pages: 4