Forecasting Staphylococcus aureus Infections Using Genome-Wide Association Studies, Machine Learning, and Transcriptomic Approaches

Cited by: 5
Authors
Sassi, Mohamed [1]
Bronsard, Julie [1]
Pascreau, Gaetan [1]
Emily, Mathieu [2]
Donnio, Pierre-Yves [1, 3]
Revest, Matthieu [1, 4]
Felden, Brice [1]
Wirth, Thierry [5, 6]
Augagneur, Yoann [1]
Affiliations
[1] INSERM, BRM Bacterial Regulatory RNAs & Med UMR S 1230, Rennes, France
[2] Univ Rennes, CNRS, Inst Agro, IRMAR Inst Rech Math Rennes UMR 6625, Rennes, France
[3] CHU Rennes, Serv Bacteriol Hyg Hosp, Rennes, France
[4] CHU Rennes, Serv Malad Infect & Reanimat Med, Rennes, France
[5] Univ Paris 06, Univ Antilles, Ecole Prat Hautes Etud, Museum Natl Hist Nat, UMR CNRS 7205, Inst Systemat, Paris, France
[6] PSL Univ, Ecole Prat Hautes Etud EPHE, Paris, France
Keywords
Staphylococcus aureus; genomics; GWAS; coding and noncoding regions; random forest; transcriptomics; genetic markers; bacteremia; nasal colonization; METHICILLIN-RESISTANT; REGULATORY RNA; NASAL CARRIAGE; VIRULENCE; BACTEREMIA; EPIDEMIOLOGY; MECHANISMS; EXPRESSION; PATHOPHYSIOLOGY; CONTRIBUTES;
DOI
10.1128/msystems.00378-22
Chinese Library Classification (CLC) number
Q93 [Microbiology];
Discipline classification code
071005; 100705;
Abstract
Staphylococcus aureus is a major human and animal pathogen, colonizing diverse ecological niches within its hosts. Predicting whether an isolate will infect a specific host and its subsequent clinical fate remains unknown. In this study, we investigated the S. aureus pangenome using a curated set of 356 strains, spanning a wide range of hosts, origins, and clinical display and antibiotic resistance profiles. We used genome-wide association study (GWAS) and random forest (RF) algorithms to discriminate strains based on their origins and clinical sources. Here, we show that the presence of sak and scn can discriminate strains based on their host specificity, while other genes such as mecA are often associated with virulent outcomes. Both GWAS and RF indicated the importance of intergenic regions (IGRs) and coding DNA sequence (CDS) but not sRNAs in forecasting an outcome. Additional transcriptomic analyses performed on the most prevalent clonal complex 8 (CC8) clonal types, in media mimicking nasal colonization or bacteremia, indicated three RNAs as potential RNA markers to forecast infection, followed by 30 others that could serve as infection severity predictors. Our report shows that genetic association and transcriptomics are complementary approaches that will be combined in a single analytical framework to improve our understanding of bacterial pathogenesis and ultimately identify potential predictive molecular markers.
IMPORTANCE Predicting the outcome of bacterial colonization and infections, based on extensive genomic and transcriptomic data from a given pathogen, would be of substantial help for clinicians in treating and curing patients. In this report, genome-wide association studies and random forest algorithms have defined gene combinations that differentiate human from animal strains, colonization from diseases, and nonsevere from severe diseases, while it revealed the importance of IGRs and CDS, but not small RNAs (sRNAs), in anticipating an outcome. In addition, transcriptomic analyses performed on the most prevalent clonal types, in media mimicking either nasal colonization or bacteremia, revealed significant differences and therefore potent RNA markers. Overall, the use of both genomic and transcriptomic data in a single analytical framework can enhance our understanding of bacterial pathogenesis.
Pages: 21