Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae

Cited by: 1

Authors:

Biffignandi, Gherard Batisti [1,2,3]

Chindelevitch, Leonid [2]

Corbella, Marta [4]

Feil, Edward J. [5]

Sassera, Davide [1,6]

Lees, John A. [3]

Affiliations:

[1] Univ Pavia, Dept Biol & Biotechnol, Pavia, Italy

[2] Imperial Coll London, MRC Ctr Global Infect Dis Anal, London, England

[3] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Hinxton, England

[4] Fdn IRCCS Policlin San Matteo, Microbiol & Virol Unit, Pavia, Italy

[5] Univ Bath, Milner Ctr Evolut, Dept Life Sci, Bath, England

[6] Fdn IRCCS Policlin San Matteo, Pavia, Italy

Source:

MICROBIAL GENOMICS | 2024 / Vol. 10 / Issue 03

Funding:

Medical Research Council (UK);

Keywords:

AMR; antibiotic resistance; bacterial genomics; GWAS; Klebsiella pneumoniae; machine learning; MIC; REGULARIZATION; SELECTION; MODELS;

DOI:

10.1099/mgen.0.001222

Chinese Library Classification (CLC) number:

Q3 [Genetics];

Discipline classification codes:

071007 ; 090102 ;

Abstract:

Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, albeit the interpretation of MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that incrementing the population database is pivotal for the future clinical implementation of these models to support routine machine learning-based diagnostics.
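The recommendation in the abstract (regression when many concentration levels are observed, classification when few) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The cut-off `LEVEL_THRESHOLD`, the helper names `parse_mic`, `choose_framing` and `encode_targets`, the MIC string format (`<=0.5`, `>32`), and the log2 transform are all assumptions made for illustration; the abstract does not state how many levels count as "large".

```python
import math

# Hypothetical cut-off: the abstract gives no number for a "large" set of levels.
LEVEL_THRESHOLD = 5

# Order used to rank censored readings that share the same numeric value.
CENSOR_RANK = {"left": 0, "none": 1, "right": 2}


def parse_mic(raw):
    """Parse an MIC string such as '<=0.5', '>32' or '4'.

    Returns (log2 of the numeric value, censoring flag), where the flag is
    'left', 'right' or 'none'. The string format is an assumption.
    """
    text = raw.strip()
    censor = "none"
    for prefix, flag in (("<=", "left"), (">=", "right"), ("<", "left"), (">", "right")):
        if text.startswith(prefix):
            censor = flag
            text = text[len(prefix):]
            break
    return math.log2(float(text)), censor


def choose_framing(raw_mics, threshold=LEVEL_THRESHOLD):
    """Pick 'regression' or 'classification' from the number of distinct observed levels."""
    levels = {parse_mic(m) for m in raw_mics}
    return "regression" if len(levels) >= threshold else "classification"


def encode_targets(raw_mics, threshold=LEVEL_THRESHOLD):
    """Encode MICs as continuous log2 values or as ordered integer class labels."""
    framing = choose_framing(raw_mics, threshold)
    parsed = [parse_mic(m) for m in raw_mics]
    if framing == "regression":
        # Continuous target; censoring is ignored here for simplicity.
        return framing, [value for value, _ in parsed]
    # Categorical target: one class per distinct (value, censoring) level.
    ordered = sorted(set(parsed), key=lambda p: (p[0], CENSOR_RANK[p[1]]))
    label_of = {level: index for index, level in enumerate(ordered)}
    return framing, [label_of[p] for p in parsed]


if __name__ == "__main__":
    print(encode_targets(["<=0.5", "1", "2", "4", "8", ">8"]))
    print(encode_targets(["<=2", "4", ">4", "4"]))
```

The encoded targets could then be passed to a regressor or a classifier of the kind named in the abstract (for example Elastic Net or Random Forests). A fuller treatment would also model the censored readings explicitly instead of discarding the censoring flag in the regression case.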

Pages: 15
