Generalizability of machine learning in predicting antimicrobial resistance in E. coli: a multi-country case study in Africa

Cited by: 11
Authors
Nsubuga, Mike [1, 3, 5, 6]
Galiwango, Ronald [1, 3]
Jjingo, Daudi [2, 3]
Mboowa, Gerald [1, 3, 4]
Affiliations
[1] Makerere Univ, Coll Hlth Sci, Sch Biomed Sci, Dept Immunol & Mol Biol, POB 7072, Kampala, Uganda
[2] Makerere Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 7062, Kampala, Uganda
[3] Makerere Univ, Infect Dis Inst, Coll Hlth Sci, African Ctr Excellence Bioinformat & Data Intens S, POB 22418, Kampala, Uganda
[4] African Union Commiss, Africa Ctr Dis Control & Prevent, POB 3243,Roosevelt St, Addis Ababa W21 K19, Ethiopia
[5] Univ Bristol, Fac Hlth Sci, Bristol BS40 5DU, England
[6] Univ Bristol, Jean Golding Inst, Bristol BS8 1UH, England
Funding
US National Institutes of Health; US National Science Foundation
Keywords
Antimicrobial resistance; E. coli; Machine learning; Africa; Whole-genome sequencing; ESCHERICHIA-COLI
DOI
10.1186/s12864-024-10214-4
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Background: Antimicrobial resistance (AMR) remains a significant global health threat, particularly impacting low- and middle-income countries (LMICs). These regions often grapple with limited healthcare resources and limited access to advanced diagnostic tools. Consequently, there is a pressing need for innovative approaches that can enhance AMR surveillance and management. Machine learning (ML), though underutilized in these settings, presents a promising avenue. This study leverages ML models trained on whole-genome sequencing data from England, where such data are more readily available, to predict AMR in E. coli, targeting key antibiotics such as ciprofloxacin, ampicillin, and cefotaxime. A crucial part of our work involved validating these models on an independent dataset from Africa, specifically from Uganda, Nigeria, and Tanzania, to ascertain their applicability and effectiveness in LMICs.

Results: Model performance varied across antibiotics. The Support Vector Machine performed best in predicting ciprofloxacin resistance (87% accuracy, F1 score: 0.57), the Light Gradient Boosting Machine for cefotaxime (92% accuracy, F1 score: 0.42), and Gradient Boosting for ampicillin (58% accuracy, F1 score: 0.66). In validation with data from Africa, Logistic Regression showed high accuracy for ampicillin (94%, F1 score: 0.97), while Random Forest and the Light Gradient Boosting Machine were effective for ciprofloxacin (50% accuracy, F1 score: 0.56) and cefotaxime (45% accuracy, F1 score: 0.54), respectively. Key mutations associated with AMR were identified for these antibiotics.

Conclusion: As the threat of AMR continues to rise, the successful application of these models, particularly on genomic datasets from LMICs, signals a promising avenue for improving AMR prediction to support large AMR surveillance programs. This work thus not only expands our current understanding of the genetic underpinnings of AMR but also provides a robust methodological framework that can guide future research and applications in the fight against AMR.
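The abstract reports accuracy and F1 score side by side, and the two can diverge sharply when resistant and susceptible isolates are unevenly represented. The following is a minimal sketch of how both metrics are derived from confusion-matrix counts. The function name and all counts are hypothetical illustrations, not values or code from the study.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) with 'resistant' treated as the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts: mostly susceptible isolates.
# True negatives dominate, so accuracy is high while F1 stays low.
acc_a, f1_a = accuracy_and_f1(tp=4, fp=6, fn=4, tn=86)

# Hypothetical counts: mostly resistant isolates.
# F1 ignores true negatives, so it can exceed accuracy here.
acc_b, f1_b = accuracy_and_f1(tp=60, fp=35, fn=5, tn=0)

print(f"mostly susceptible: accuracy={acc_a:.2f}, F1={f1_a:.2f}")
print(f"mostly resistant:   accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

Because F1 depends only on true positives, false positives, and false negatives, a pairing of high accuracy with low F1 (or the reverse) is arithmetically possible and may reflect class balance rather than an inconsistency.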
Pages: 13