ChromaPipe: a pipeline for analysis, quality control and management for a DNA sequencing facility

Cited by: 113
Authors
Otto, T. D. [1 ,2 ]
Vasconcellos, E. A. [1 ,2 ]
Gomes, L. H. F. [1 ,3 ]
Moreira, A. S. [1 ]
Degrave, W. M. [1 ]
Mendonca-Lima, L. [1 ]
Alves-Ferreira, M. [1 ]
Affiliations
[1] Inst Oswaldo Cruz, FIOCRUZ, Lab Genom Func & Bioinformat, Rio De Janeiro, Brazil
[2] Fundacao Ataulpho Paiva, Rio De Janeiro, Brazil
[3] Univ Fed Rio de Janeiro, Fac Med, BR-21941 Rio De Janeiro, Brazil
Keywords
Sequencing pipeline; Chromatogram processing; DNA sequencing;
DOI
10.4238/vol7-3X-Meeting04
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Optimizing and monitoring the data flow in high-throughput sequencing facilities is important for data input and output, for tracking the status of results for the users of the facility, and for guaranteeing a high-quality service. In a multi-user environment with different throughputs, each user wants to access their data easily, track their sequencing history, analyze sequences and their quality, and apply some basic post-sequencing analysis, without having to install additional software. Recently, Fiocruz established such a core facility as a "technological platform". The infrastructure includes a 48-capillary 3730 DNA Sequence Analyzer (Applied Biosystems) and supporting equipment. The service includes running samples for large-scale users, performing DNA sequencing reactions and runs for medium and small users, and participating in partial or full genome projects. We implemented a workflow that fulfills these requirements for both small-scale and high-throughput users. Our implementation also lets the sequencing staff monitor data for continuous quality improvement (reports by plate, month and user). For the user, different analyses of the chromatograms, such as visualization of good-quality regions, as well as processing steps, such as comparisons or assemblies, are available. So far, 180 users have made use of the service, generating 155,000 sequences, 35% of which were produced for the BCG Moreau-RJ genome project. The pipeline (named ChromaPipe, for Chromatogram Pipeline) is available for download by the scientific community at the URL http://bioinfo.pdtis.fiocruz.br/ChromaPipe/. The support for assembly is also configured as a web service: http://bioinfo.pdtis.fiocruz.br/Assembly/.
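The abstract mentions locating good-quality regions in chromatograms and producing per-plate quality reports. As a rough illustration only, the sketch below shows one common way such a step can be done: a maximum-sum-segment trim over per-base Phred scores, followed by a per-plate average. This is not taken from ChromaPipe; the function names `good_quality_region` and `report_by_plate`, the cutoff of 20, and the input layout are all assumptions made for the example.

```python
from collections import defaultdict


def good_quality_region(quals, cutoff=20):
    """Return the half-open (start, end) span that maximizes sum(q - cutoff).

    Hypothetical helper, not ChromaPipe's own method. Returns (0, 0) when
    no base scores above the cutoff.
    """
    best_sum, best = 0, (0, 0)
    run_sum, run_start = 0, 0
    for i, q in enumerate(quals):
        if run_sum <= 0:
            # A non-positive running sum cannot help; restart the segment here.
            run_sum, run_start = 0, i
        run_sum += q - cutoff
        if run_sum > best_sum:
            best_sum, best = run_sum, (run_start, i + 1)
    return best


def report_by_plate(reads, cutoff=20):
    """Map each plate id to the mean good-region length of its reads.

    `reads` is an iterable of (plate_id, quals) pairs (assumed layout).
    """
    lengths = defaultdict(list)
    for plate, quals in reads:
        start, end = good_quality_region(quals, cutoff)
        lengths[plate].append(end - start)
    return {plate: sum(v) / len(v) for plate, v in lengths.items()}


if __name__ == "__main__":
    reads = [
        ("P1", [5, 8, 30, 35, 40, 10, 38, 6, 4]),
        ("P1", [5, 5]),
        ("P2", [30, 30, 30]),
    ]
    print(report_by_plate(reads))
```

The same grouping could be keyed on month or user instead of plate to match the other report types the abstract lists.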
Pages: 861-871
Page count: 11