Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data

Cited by: 130
Authors
Clausen, Philip T. L. C. [1]
Zankari, Ea [1,2]
Aarestrup, Frank M. [2]
Lund, Ole [1]
Affiliations
[1] Tech Univ Denmark, Ctr Biol Sequence Anal, Dept Syst Biol, DK-2800 Lyngby, Denmark
[2] Tech Univ Denmark, Res Grp Genom Epidemiol, Natl Food Inst, DK-2800 Lyngby, Denmark
Keywords
SURVEILLANCE;
DOI
10.1093/jac/dkw184
Chinese Library Classification
R51 [Infectious diseases];
Discipline classification code
100401;
Abstract
Next generation sequencing (NGS) may be an alternative to phenotypic susceptibility testing for surveillance and clinical diagnosis. However, current bioinformatics methods may be associated with false positives and negatives. In this study, a novel mapping method was developed and benchmarked against two methods in current use for identification of antibiotic resistance genes in bacterial WGS data. The novel method, KmerResistance, examines the co-occurrence of k-mers between the WGS data and a database of resistance genes. Its performance was compared with two previously described methods, ResFinder and SRST2, which use an assembly/BLAST method and BWA, respectively. The comparison used two datasets with a total of 339 isolates, covering five species, originating from the Oxford University Hospitals NHS Trust and Danish pig farms. The predicted resistance was compared with the observed phenotypes for all isolates. To further challenge the sensitivity of the in silico methods, the datasets were also down-sampled to 1% of the reads and reanalysed. The best results were obtained by identification of resistance genes by mapping directly against the raw reads. This indicates that information might be lost during assembly. KmerResistance performed significantly better than the other methods when data were contaminated or contained only a few sequence reads. Read mapping is superior to assembly-based methods, and the new KmerResistance seemingly outperforms currently available methods, particularly when including datasets with few reads.
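The abstract describes the core idea only as "co-occurrence of k-mers between the WGS data and a database of resistance genes". As a rough illustration of that idea, and not the published KmerResistance implementation, the sketch below scores each gene by the fraction of its k-mers that also occur in the reads. The function names, the choice of k = 8, and the toy sequences are all assumptions made for this example. It also ignores reverse complements and sequencing errors, which a real tool would have to handle.

```python
def kmers(seq, k):
    """Return all overlapping substrings of length k in seq (upper-cased)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_coverage(reads, genes, k=8):
    """For each gene, return the fraction of its distinct k-mers seen in the reads.

    reads: iterable of read sequences (forward strand only, a simplification)
    genes: dict mapping a gene name to its sequence (hypothetical toy database)
    """
    read_kmers = set()
    for read in reads:
        read_kmers.update(kmers(read, k))

    scores = {}
    for name, gene in genes.items():
        gene_kmers = set(kmers(gene, k))
        if not gene_kmers:
            # Gene shorter than k: no k-mers to compare.
            scores[name] = 0.0
            continue
        scores[name] = len(gene_kmers & read_kmers) / len(gene_kmers)
    return scores


# Toy data, invented for illustration only.
gene_a = "ATGACCGTTAGCCTGAAGTCCGATTGCAAGGCTTACGGAT"
gene_b = "TTTTCCCCGGGGAAAATTTTCCCCGGGGAAAATTTTCCCC"
genes = {"geneA": gene_a, "geneB": gene_b}

# Two overlapping "reads" that together span geneA.
reads = [gene_a[0:25], gene_a[15:40]]

scores = kmer_coverage(reads, genes, k=8)
for name, score in scores.items():
    print(name, round(score, 2))
```

In this toy case the reads are drawn from geneA only, so geneA is fully covered and geneB shares no k-mers with the reads. Because the score is computed directly from reads, no assembly step is involved, which is the property the abstract credits for the better results on sparse data.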
Pages: 2484-2488
Page count: 5