MACI: A machine learning-based approach to identify drug classes of antibiotic resistance genes from metagenomic data

Cited by: 2
Authors
Chowdhury, Rohit Roy [1]
Dhar, Jesmita [2]
Robinson, Stephy Mol [2]
Lahiri, Abhishake [2, 3]
Basak, Kausik [2]
Paul, Sandip [2]
Banerjee, Rachana [2]
Affiliations
[1] JIS Univ, JIS Inst Adv Studies & Res Kolkata, Ctr Data Sci, Kolkata, WB, India
[2] JIS Univ, JIS Inst Adv Studies & Res, Ctr Hlth Sci & Technol, Kolkata, WB, India
[3] CSIR Indian Inst Chem Biol, Div Struct Biol & Bioinformat, Kolkata, WB, India
Keywords
Antibiotic resistance gene; Drug class; Machine learning; Gene sequencing; Taxonomic clades; Metagenomic reads; ANTIMICROBIAL RESISTANCE; BETA-LACTAMASE; CD-HIT; MECHANISMS; RESISTOME; BACTERIA; PROTEIN
DOI
10.1016/j.compbiomed.2023.107629
Chinese Library Classification number
Q [Biological sciences]
Subject classification code
07; 0710; 09
Abstract
Novel methodologies are now essential for identifying antibiotic-resistant pathogens in order to combat them. Here, we present a model, MACI (Machine learning-based Antibiotic resistance gene-specific drug Class Identification), that takes metagenomic fragments as input and predicts the drug class of antibiotic resistance genes. We trained the model on the Comprehensive Antibiotic Resistance Database, which contains 5138 representative sequences across 134 drug classes. Among these classes, 23 dominated, contributing 85% of the sequence data. The model achieved an average precision of 0.8389 ± 0.0747 and recall of 0.8197 ± 0.0782 for these 23 drug classes. It also performed better (precision and recall: 0.8817 ± 0.0540 and 0.8620 ± 0.0493) when predicting multidrug resistance classes than single-drug resistance categories (0.7923 ± 0.0669 and 0.7737 ± 0.0794). The model also showed promising results when tested on an independent dataset. We then analysed these 23 drug classes to identify class-specific overlapping nucleotide patterns. Five significant drug classes, viz. "carbapenem; cephalosporin; penam", "cephalosporin", "cephamycin", "cephalosporin; monobactam; penam; penem", and "fluoroquinolone", were identified, and their patterns aligned with the functional domains of antibiotic resistance genes. These class-specific patterns play a pivotal role in rapidly identifying the drug classes associated with antibiotic resistance genes. Further analysis revealed that bacterial species harbouring genes from these five drug classes are associated with well-known multidrug resistance properties.
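The precision and recall figures in the abstract are reported as a mean with a spread over drug classes. As a rough illustration only, the sketch below computes per-class precision and recall from true and predicted labels and then summarises them as mean and sample standard deviation. It is not the authors' code: the function names, the toy labels, and the assumption that the reported spread is a standard deviation taken across classes (rather than, say, across cross-validation folds) are all hypothetical.

```python
from statistics import mean, stdev


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label present in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Define the score as 0.0 when the denominator is empty.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        scores[label] = (precision, recall)
    return scores


def summarize(scores):
    """Return ((mean_precision, sd_precision), (mean_recall, sd_recall))."""
    precisions = [p for p, _ in scores.values()]
    recalls = [r for _, r in scores.values()]
    return (mean(precisions), stdev(precisions)), (mean(recalls), stdev(recalls))


# Toy labels (hypothetical), using three drug-class names from the abstract.
y_true = ["cephalosporin", "cephalosporin", "cephamycin",
          "cephamycin", "fluoroquinolone", "fluoroquinolone"]
y_pred = ["cephalosporin", "cephamycin", "cephamycin",
          "cephamycin", "fluoroquinolone", "cephalosporin"]

scores = per_class_precision_recall(y_true, y_pred)
(precision_mean, precision_sd), (recall_mean, recall_sd) = summarize(scores)
print(f"precision: {precision_mean:.4f} +/- {precision_sd:.4f}")
print(f"recall:    {recall_mean:.4f} +/- {recall_sd:.4f}")
```

Averaging per class (a macro average) weights every drug class equally regardless of how many sequences it contains, which matters when a few classes dominate the data, as the 23 classes covering 85% of sequences do here.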
Pages: 15