Genomics and Machine Learning for Taxonomy Consensus: The Mycobacterium tuberculosis Complex Paradigm

Cited by: 23
Authors
Aze, Jerome [1 ]
Sola, Christophe [2 ]
Zhang, Jian [2 ]
Lafosse-Marin, Florian [2 ]
Yasmin, Memona [3 ,4 ]
Siddiqui, Rubina [4 ]
Kremer, Kristin [5 ]
van Soolingen, Dick [5 ,6 ,7 ]
Refregier, Guislaine [2 ]
Affiliations
[1] LIRMM UM CNRS, UMR 5506, F-34095 Montpellier 2, France
[2] Univ Paris 11, CNRS, CEA, Inst Integrat Biol Cell I2BC, F-91405 Orsay, France
[3] PIEAS, Islamabad, Pakistan
[4] NIBGE, Hlth Biotechnol Div, Faisalabad, Pakistan
[5] Natl Inst Publ Hlth & Environm, NL-3720 BA Bilthoven, Netherlands
[6] Radboud Univ Nijmegen, Univ Lung Ctr Dekkerswald, Med Ctr, Dept Pulm Dis, NL-6500 HB Nijmegen, Netherlands
[7] Radboud Univ Nijmegen, Univ Lung Ctr Dekkerswald, Med Ctr, Dept Microbiol, NL-6500 HB Nijmegen, Netherlands
Source
PLOS ONE | 2015 / Vol. 10 / Issue 07
Keywords
VARIABLE-NUMBER; GENETIC DIVERSITY; MOLECULAR EPIDEMIOLOGY; STRAIN DIFFERENTIATION; DISCRIMINATORY POWER; GLOBAL DISTRIBUTION; DNA POLYMORPHISM; CLASSIFICATION; HYBRIDIZATION; TOOL;
DOI
10.1371/journal.pone.0130912
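Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code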
07 ; 0710 ; 09 ;
Abstract
Infra-species taxonomy is a prerequisite for comparing features such as virulence across pathogen lineages. Mycobacterium tuberculosis complex taxonomy has evolved rapidly over the last 20 years through intensive clinical isolation, advances in sequencing, and the description of fast-evolving loci (CRISPR and MIRU-VNTR). Online tools to describe new isolates have been set up based on known diversity either on CRISPRs (also known as spoligotypes) or on MIRU-VNTR profiles. The underlying taxonomies are largely concordant but use different names and offer different depths. The objectives of this study were 1) to make explicit the consensus that exists between the alternative taxonomies, and 2) to provide an online tool to ease classification of new isolates. Genotyping (24-VNTR, 43-spacer spoligotypes, IS6110-RFLP) was undertaken for 3,454 clinical isolates from the Netherlands (2004-2008). The resulting database was enlarged with African isolates to include most human tuberculosis diversity. Assignments were obtained using TB-Lineage, MIRU-VNTRPlus, SITVITWEB and an algorithm from Borile et al. By identifying the recurrent concordances between the alternative taxonomies, we proposed a consensus including 22 sublineages. Original and consensus assignments of all isolates from the database were subsequently fed into an ensemble learning approach based on the Machine Learning tool Weka to derive a classification scheme. All assignments were reproduced with very good sensitivities and specificities. When applied to independent datasets, the scheme was able to suggest new sublineages such as pseudo-Beijing. This Lineage Prediction tool, efficient on 15-MIRU, 24-VNTR and spoligotype data, is available on the web interface "TBminer." Another section of this website helps summarize key molecular epidemiological data, easing tuberculosis surveillance.
Altogether, we successfully used Machine Learning on a large dataset to set up and make available the first consensus taxonomy for the human Mycobacterium tuberculosis complex. Additional developments using SNPs will help stabilize it.
Pages: 24