Machine Learning Algorithms for Classification of MALDI-TOF MS Spectra from Phylogenetically Closely Related Species Brucella melitensis, Brucella abortus and Brucella suis

Cited by: 14
Authors
Dematheis, Flavia [1]
Walter, Mathias C. [1]
Lang, Daniel [1]
Antwerpen, Markus [1]
Scholz, Holger C. [2]
Pfalzgraf, Marie-Theres [1]
Mantel, Enrico [1]
Hinz, Christin [1]
Wolfel, Roman [1]
Zange, Sabine [1]
Affiliations
[1] Bundeswehr Inst Microbiol, Neuherbergstr 11, D-80937 Munich, Germany
[2] Robert Koch Inst RKI, Ctr Biol Threats & Special Pathogens, Seestr 10, D-13353 Berlin, Germany
Keywords
MALDI-TOF MS; Brucella melitensis; B. suis; B. abortus; machine learning; nested k-fold cross validation; feature selection; R; DESORPTION IONIZATION-TIME; STAPHYLOCOCCUS-AUREUS; MASS-SPECTROMETRY; FEATURE-SELECTION; R-PACKAGE; IDENTIFICATION; DISCRIMINATION
DOI
10.3390/microorganisms10081658
Chinese Library Classification number
Q93 [Microbiology]
Discipline classification codes
071005; 100705
Abstract
(1) Background: MALDI-TOF mass spectrometry (MS) is the gold standard for microbial fingerprinting; however, for phylogenetically closely related species, its resolving power drops to the genus level. In this study, we analyzed MALDI-TOF spectra from 44 strains of B. melitensis, B. suis and B. abortus to identify the optimal classification method among popular supervised and unsupervised machine learning (ML) algorithms. (2) Methods: A consensus feature selection strategy was applied to pinpoint, from among the 500 MS features, those that yielded the best ML model and that may play a role in species differentiation. Unsupervised k-means and hierarchical agglomerative clustering were evaluated using the silhouette coefficient, while the supervised classifiers Random Forest, Support Vector Machine, Neural Network, and Multinomial Logistic Regression were tuned using nested k-fold cross validation (CV) with a feature reduction step between the two CV loops. (3) Results: Sixteen differentially expressed peaks were identified and used as input to the ML classifiers. Unsupervised and optimized supervised models displayed excellent predictive performance, with 100% accuracy. The consensus feature selection strategy was shown to be suitable for achieving an accurate learning system. (4) Conclusion: We introduce an ML approach that enhances Brucella spp. classification using MALDI-TOF MS data.
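The nested k-fold CV with a feature reduction step between the two loops, as described in the Methods, might be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the study used R and the classifiers named above, whereas this example uses plain Python, synthetic data, a simple class-mean-spread feature score, and a k-nearest-neighbour classifier as a stand-in. All function names and parameter values (make_data, select_features, nested_cv, n_keep, k_grid) are assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean

def make_data(seed=1, n_per_class=15, n_features=20, n_informative=4):
    """Synthetic stand-in for peak intensities (NOT the study's spectra)."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(3):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * c if j < n_informative else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(c)
    return X, y

def stratified_folds(rows, y, k, rng):
    """Split `rows` into k folds, dealing each class out round-robin."""
    folds = [[] for _ in range(k)]
    for c in sorted({y[i] for i in rows}):
        members = [i for i in rows if y[i] == c]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds

def select_features(X, y, rows, n_keep):
    """Assumed feature score: spread of per-class means, computed on `rows` only."""
    classes = sorted({y[i] for i in rows})
    scored = []
    for j in range(len(X[0])):
        means = [mean(X[i][j] for i in rows if y[i] == c) for c in classes]
        scored.append((max(means) - min(means), j))
    return [j for _, j in sorted(scored, reverse=True)[:n_keep]]

def knn_predict(X, y, fit_rows, x, feats, k):
    """Majority vote among the k nearest training rows (squared Euclidean)."""
    nearest = sorted(fit_rows,
                     key=lambda i: sum((X[i][j] - x[j]) ** 2 for j in feats))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]

def accuracy(X, y, fit_rows, test_rows, feats, k):
    hits = sum(knn_predict(X, y, fit_rows, X[i], feats, k) == y[i]
               for i in test_rows)
    return hits / len(test_rows)

def nested_cv(X, y, n_keep=4, k_grid=(1, 3, 5), k_outer=5, k_inner=3, seed=0):
    rng = random.Random(seed)
    outer = stratified_folds(list(range(len(X))), y, k_outer, rng)
    outer_scores = []
    for o, test in enumerate(outer):
        train = [i for f, fold in enumerate(outer) if f != o for i in fold]
        # Feature reduction between the loops: uses outer-training rows only,
        # so the held-out outer fold never influences which features are kept.
        feats = select_features(X, y, train, n_keep)
        inner = stratified_folds(train, y, k_inner, rng)

        def inner_score(k):
            # Inner loop: estimate how well hyperparameter k does on `train`.
            scores = []
            for v, val in enumerate(inner):
                fit = [i for f, fold in enumerate(inner) if f != v for i in fold]
                scores.append(accuracy(X, y, fit, val, feats, k))
            return sum(scores) / len(scores)

        best_k = max(k_grid, key=inner_score)
        # Outer loop: score the tuned model on data unseen during tuning.
        outer_scores.append(accuracy(X, y, train, test, feats, best_k))
    return sum(outer_scores) / len(outer_scores)

if __name__ == "__main__":
    X, y = make_data()
    print("nested CV accuracy:", nested_cv(X, y))
```

The point of the structure is that both the feature selection and the hyperparameter choice see only the outer training rows, so the outer-fold accuracy is not inflated by information leaking from the test samples.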
Pages: 14