Alfred: interactive multi-sample BAM alignment statistics, feature counting and feature annotation for long- and short-read sequencing

Cited by: 52

Authors:

Rausch, Tobias ^{[1,2]}

Fritz, Markus Hsi-Yang ^{[2]}

Korbel, Jan O. ^{[2]}

Benes, Vladimir ^{[1]}

Affiliations:

[1] EMBL, Genom Core Facil, Meyerhofstr 1, D-69117 Heidelberg, Germany

[2] EMBL, Genome Biol Unit, Meyerhofstr 1, D-69117 Heidelberg, Germany

Source:

BIOINFORMATICS | 2019 / Vol. 35 / Issue 14

Keywords:

QUALITY-CONTROL;

DOI:

10.1093/bioinformatics/bty1007

Chinese Library Classification number:

Q5 [Biochemistry];

Subject classification code:

071010 ; 081704 ;

Abstract:

Harmonizing quality control (QC) of large-scale second- and third-generation sequencing datasets is key for enabling downstream computational and biological analyses. We present Alfred, an efficient and versatile command-line application that computes multi-sample QC metrics in a read-group aware manner, across a wide variety of sequencing assays and technologies. In addition to standard QC metrics such as GC bias, base composition, insert size and sequencing coverage distributions, it supports haplotype-aware and allele-specific feature counting and feature annotation. The versatility of Alfred allows for easy pipeline integration in high-throughput settings, including DNA sequencing facilities and large-scale research initiatives, enabling continuous monitoring of sequence data quality and characteristics across samples. Alfred supports haplo-tagging of BAM/CRAM files to conduct haplotype-resolved analyses in conjunction with a variety of next-generation sequencing-based assays. Alfred's companion web application enables interactive exploration of results and comparison to public datasets.

Availability and implementation: Alfred is open-source and freely available at https://tobiasrausch.com/alfred/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Pages: 2489-2491

Page count: 3
