Contingency, repeatability, and predictability in the evolution of a prokaryotic pangenome

Cited by: 19
Authors
Beavan, Alan [1]
Domingo-Sananes, Maria Rosa
McInerney, James O. [1]
Affiliations
[1] Univ Nottingham, Sch Life Sci, Nottingham NG7 2UH, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK;
Keywords
pangenomes; machine learning; evolution; ESCHERICHIA-COLI; GENE; SELECTION; SEQUENCE; TREE; KEGG;
DOI
10.1073/pnas.2304934120
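Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes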
07 ; 0710 ; 09 ;
Abstract
Pangenomes exhibit remarkable variability in many prokaryotic species, much of which is maintained through the processes of horizontal gene transfer and gene loss. Repeated acquisitions of near-identical homologs can easily be observed across pangenomes, leading to the question of whether these parallel events potentiate similar evolutionary trajectories, or whether the remarkably different genetic backgrounds of the recipients mean that postacquisition evolutionary trajectories end up being quite different. In this study, we present a machine learning method that predicts the presence or absence of genes in the Escherichia coli pangenome based on complex patterns of the presence or absence of other accessory genes within a genome. Our analysis leverages the repeated transfer of genes through the E. coli pangenome to observe patterns of repeated evolution following similar events. We find that the presence or absence of a substantial set of genes is highly predictable from other genes alone, indicating that selection potentiates and maintains gene-gene co-occurrence and avoidance relationships deterministically over long-term bacterial evolution and is robust to differences in host evolutionary history. We propose that at least part of the pangenome can be understood as a set of genes with relationships that govern their likely cohabitants, analogous to an ecosystem's set of interacting organisms. Our findings indicate that intragenomic gene fitness effects may be key drivers of prokaryotic evolution, influencing the repeated emergence of complex gene-gene relationships across the pangenome.
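Illustrative sketch (not part of the original record): the core idea in the abstract, predicting whether one gene is present from the presence or absence of other accessory genes, can be shown on toy data. The abstract does not name the classifier used, so the Bernoulli naive Bayes model below is a stand-in chosen only because it fits in a few lines of standard-library Python. The synthetic genomes, the gene roles, and the function names `train_bernoulli_nb` and `predict` are all assumptions made for illustration.

```python
import math
import random

def train_bernoulli_nb(X, y):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing.

    X: list of 0/1 rows (presence/absence of predictor genes per genome).
    y: list of 0/1 labels (presence/absence of the target gene).
    """
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + 1) / (len(X) + 2)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[label] = (prior, probs)
    return model

def predict(model, x):
    """Return the label (0 or 1) with the higher log posterior score."""
    scores = {}
    for label, (prior, probs) in model.items():
        s = math.log(prior)
        for xj, p in zip(x, probs):
            s += math.log(p if xj else 1 - p)
        scores[label] = s
    return max(scores, key=scores.get)

# Synthetic pangenome (assumption): 200 genomes, 5 predictor genes.
# Gene 0 co-occurs with the target, gene 1 avoids it, genes 2-4 are noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    target = rng.randint(0, 1)
    noise = [rng.randint(0, 1) for _ in range(3)]
    X.append([target, 1 - target] + noise)
    y.append(target)

# Train on the first 150 genomes, evaluate on the remaining 50.
model = train_bernoulli_nb(X[:150], y[:150])
hits = sum(predict(model, x) == t for x, t in zip(X[150:], y[150:]))
accuracy = hits / 50
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setting the target gene is fully determined by its co-occurring and avoiding partners, so it is easy to predict. Real pangenome data would be far noisier and would need to account for shared ancestry between genomes.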
Pages: 10