Hierarchical classification of microorganisms based on high-dimensional phenotypic data

Cited by: 23
Authors
Tafintseva, Valeria [1]
Vigneau, Evelyne [2]
Shapaval, Volha [1]
Cariou, Veronique [2]
Qannari, El Mostafa [2]
Kohler, Achim [1]
Affiliations
[1] Norwegian Univ Life Sci, Fac Sci & Technol, N-1432 As, Norway
[2] INRA, Oniris, StatSC, Nantes, France
Keywords
classification analysis; FTIR spectroscopy of microorganisms; hierarchical tree structure; TRANSFORM INFRARED-SPECTROSCOPY; PARTIAL LEAST-SQUARES; FT-IR SPECTROSCOPY; RAPID IDENTIFICATION; VARIABLE SELECTION; BACTERIA; DIFFERENTIATION; SPARSE; TOOL
DOI
10.1002/jbio.201700047
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The classification of microorganisms by high-dimensional phenotyping methods such as FTIR spectroscopy is often complicated by the complexity of microbial phylogenetic taxonomy. A hierarchical structure developed for such data can often facilitate the classification analysis. The hierarchical tree structure can either be imposed on a given set of phenotypic data by integrating the phylogenetic taxonomic structure, or be set up by revealing the inherent clusters in the phenotypic data. In this study, we compare different approaches to hierarchical classification of microorganisms based on high-dimensional phenotypic data. A set of 19 different species of molds (filamentous fungi) obtained from the mycological strain collection of the Norwegian Veterinary Institute (Oslo, Norway) is used for the study. Hierarchical cluster analysis is performed to set up the classification trees. Classification algorithms such as artificial neural networks (ANN), partial least squares discriminant analysis (PLS-DA) and random forest (RF) are used and compared. ANN and RF outperformed all the other approaches, even though neither used a predefined hierarchical structure. To our knowledge, the RF approach is used here for the first time to classify microorganisms by FTIR spectroscopy.
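The data-driven route described in the abstract (cluster the phenotypic data to obtain a tree, then classify by descending that tree) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the four species labels and three-point "spectra" are made-up toy values, average linkage with Euclidean distance is an assumed clustering choice, and a nearest-centroid rule stands in for the node classifiers (the study itself compares ANN, PLS-DA and RF).

```python
from itertools import combinations
from math import dist


def class_means(spectra, labels):
    """Average the spectra of each class into one mean spectrum."""
    sums, counts = {}, {}
    for x, y in zip(spectra, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def build_tree(means):
    """Agglomerative clustering (average linkage) of the class means.

    Returns a nested tuple whose leaves are class labels.
    """
    # Each cluster is (subtree, list of member labels).
    clusters = [(label, [label]) for label in sorted(means)]

    def linkage(a, b):
        pairs = [(p, q) for p in a[1] for q in b[1]]
        return sum(dist(means[p], means[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into one internal node.
        a, b = min(combinations(clusters, 2), key=lambda ab: linkage(*ab))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


def leaves(tree):
    """List the class labels under a (sub)tree."""
    if isinstance(tree, str):
        return [tree]
    return [label for sub in tree for label in leaves(sub)]


def centroid(labels, means):
    """Mean of the class means for the given labels."""
    vecs = [means[label] for label in labels]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def predict(tree, means, x):
    """Descend the tree, taking the branch with the nearest centroid.

    Nearest-centroid is a stand-in for a trained node classifier.
    """
    while not isinstance(tree, str):
        tree = min(tree, key=lambda sub: dist(centroid(leaves(sub), means), x))
    return tree


# Toy data: hypothetical species A-D, two "spectra" each (assumed values).
spectra = [
    [1.0, 0.0, 0.0], [1.2, 0.0, 0.0],
    [1.0, 0.4, 0.0], [1.2, 0.4, 0.0],
    [0.0, 0.0, 5.0], [0.0, 0.2, 5.0],
    [0.0, 1.0, 5.0], [0.0, 1.2, 5.0],
]
labels = ["A", "A", "B", "B", "C", "C", "D", "D"]

means = class_means(spectra, labels)
tree = build_tree(means)
predicted = predict(tree, means, [1.15, 0.35, 0.1])
print(tree, predicted)
```

A flat classifier, by contrast, would choose among all classes in one step. The abstract reports that the flat ANN and RF models performed best on the mold data set.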
Pages: 10