Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae

Times cited: 1
Authors
Biffignandi, Gherard Batisti [1, 2, 3]
Chindelevitch, Leonid [2]
Corbella, Marta [4]
Feil, Edward J. [5]
Sassera, Davide [1, 6]
Lees, John A. [3]
Affiliations
[1] Univ Pavia, Dept Biol & Biotechnol, Pavia, Italy
[2] Imperial Coll London, MRC Ctr Global Infect Dis Anal, London, England
[3] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Hinxton, England
[4] Fdn IRCCS Policlin San Matteo, Microbiol & Virol Unit, Pavia, Italy
[5] Univ Bath, Milner Ctr Evolut, Dept Life Sci, Bath, England
[6] Fdn IRCCS Policlin San Matteo, Pavia, Italy
Source
MICROBIAL GENOMICS | 2024 / Volume 10 / Issue 03
Funding
UK Medical Research Council;
Keywords
AMR; antibiotic resistance; bacterial genomics; GWAS; Klebsiella pneumoniae; machine learning; MIC; REGULARIZATION; SELECTION; MODELS;
DOI
10.1099/mgen.0.001222
Chinese Library Classification
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, albeit the interpretation of MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification.
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that incrementing the population database is pivotal for the future clinical implementation of these models to support routine machine-learning-based diagnostics.
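The abstract's central recommendation (regression when many concentration levels are observed, classification when few) can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names `parse_mic` and `choose_framing`, the string format of the reported MICs, and in particular the cut-off `min_levels_for_regression=5` are assumptions made for illustration only; the abstract does not state a numeric threshold.

```python
import math
from typing import Iterable, NamedTuple


class ParsedMIC(NamedTuple):
    log2_value: float  # log2 of the reported concentration
    censoring: str     # "left", "right", or "none"


def parse_mic(raw: str) -> ParsedMIC:
    """Parse a reported MIC such as '<=0.25', '>32' or '4'.

    Assumption: values prefixed with '<' or '<=' are left-censored and
    values prefixed with '>' or '>=' are right-censored.
    """
    text = raw.strip().replace(" ", "")
    censoring = "none"
    # Check two-character prefixes before their one-character forms.
    for prefix, kind in (("<=", "left"), (">=", "right"),
                         ("<", "left"), (">", "right")):
        if text.startswith(prefix):
            censoring = kind
            text = text[len(prefix):]
            break
    return ParsedMIC(math.log2(float(text)), censoring)


def choose_framing(raw_mics: Iterable[str],
                   min_levels_for_regression: int = 5) -> str:
    """Pick 'regression' or 'classification' from the number of distinct
    observed concentration levels.

    The default threshold is a hypothetical placeholder, not a value
    taken from the paper.
    """
    levels = {parse_mic(m).log2_value for m in raw_mics}
    if len(levels) >= min_levels_for_regression:
        return "regression"
    return "classification"
```

For instance, a panel reported as `["<=0.25", "0.5", "1", "2", "4", "8", ">8"]` spans six distinct levels and would be framed as regression under this placeholder threshold, whereas `["<=2", "4", ">4"]` spans two and would be framed as classification. The censoring flag is kept alongside the log2 value so that a downstream model can treat censored observations separately if desired.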
Pages: 15