OTP: An automatized system for managing and processing NGS data

Cited by: 49
Authors
Reisinger, Eva [1, 2]
Genthner, Lena [1]
Kerssemakers, Jules [1]
Kensche, Philip [1]
Borufka, Stefan [1]
Jugold, Alke [1]
Kling, Andreas [1]
Prinz, Manuel [1]
Scholz, Ingrid [1]
Zipprich, Gideon [1]
Eils, Roland [1, 2, 3, 4, 5]
Lawerenz, Christian [1]
Eils, Juergen [1]
Affiliations
[1] German Canc Res Ctr, Dept Theoret Bioinformat, Heidelberg, Germany
[2] DKFZ, DKFZ HIPO, Heidelberg Ctr Personalized Oncol, Heidelberg, Germany
[3] Heidelberg Univ, Inst Pharm & Mol Biotechnol, Heidelberg, Germany
[4] Heidelberg Univ, Bioquant Ctr, Heidelberg, Germany
[5] Heidelberg Univ, German Ctr Lung Res DZL, Translat Lung Res Ctr Heidelberg TLRC, Heidelberg, Germany
Keywords
Next-generation sequencing; Data management; Data processing; Automation; User interface; Standardization; Quality control; Alignment
DOI
10.1016/j.jbiotec.2017.08.006
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
The One Touch Pipeline (OTP) is an automation platform that manages Next-Generation Sequencing (NGS) data and calls bioinformatic pipelines to process these data. OTP handles the complete digital process in an automated and scalable way, from the import of raw sequence data, via alignment of sequencing reads, to the identification of genomic events. OTP pursues three major goals: first, to reduce the human resources required for data management by introducing automated processes; second, to reduce the time until the sequences can be analyzed by bioinformatic experts, by executing all operations more reliably and quickly; third, to store all information in one system with secure web access and search capabilities. From a software architecture perspective, OTP is both an information center and a workflow management system. As a workflow management system, OTP calls several NGS pipelines that can easily be adapted and extended according to new requirements. As an information center, it comprises a database for metadata as well as a structured file system. Based on complete and consistent information, data management and bioinformatic pipelines within OTP are executed automatically, with all steps recorded in a database.
Pages: 53-62
Page count: 10