Learning Retention Mechanisms and Evolutionary Parameters of Duplicate Genes from Their Expression Data

Cited by: 6
Authors
DeGiorgio, Michael [1, 2]
Assis, Raquel [1, 2]
Affiliations
[1] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
[2] Florida Atlantic Univ, Inst Human Hlth & Dis Intervent, Boca Raton, FL 33431 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
gene duplication; neofunctionalization; subfunctionalization; Ornstein-Uhlenbeck; neural network; GENOME DUPLICATION; POSITIVE SELECTION; TRANSITION-STATE; SMALL-SCALE; DROSOPHILA; RATES; NEOFUNCTIONALIZATION; DIVERGENCE; PROTEINS; SUBFUNCTIONALIZATION;
DOI
10.1093/molbev/msaa267
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Learning about the roles that duplicate genes play in the origins of novel phenotypes requires an understanding of how their functions evolve. A previous method for achieving this goal, CDROM, employs gene expression distances as proxies for functional divergence and then classifies the evolutionary mechanisms retaining duplicate genes from comparisons of these distances in a decision tree framework. However, CDROM does not account for stochastic shifts in gene expression or leverage advances in contemporary statistical learning for performing classification, nor is it capable of predicting the parameters driving duplicate gene evolution. Thus, here we develop CLOUD, a multi-layer neural network built on a model of gene expression evolution that can both classify duplicate gene retention mechanisms and predict their underlying evolutionary parameters. We show that not only is the CLOUD classifier substantially more powerful and accurate than CDROM, but that it also yields accurate parameter predictions, enabling a better understanding of the specific forces driving the evolution and long-term retention of duplicate genes. Further, application of the CLOUD classifier and predictor to empirical data from Drosophila recapitulates many previous findings about gene duplication in this lineage, showing that new functions often emerge rapidly and asymmetrically in younger duplicate gene copies, and that functional divergence is driven by strong natural selection. Hence, CLOUD represents a major advancement in classifying retention mechanisms and predicting evolutionary parameters of duplicate genes, thereby highlighting the utility of incorporating sophisticated statistical learning techniques to address long-standing questions about evolution after gene duplication.
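The distance-based classification that the abstract attributes to CDROM can be illustrated with a minimal sketch. It assumes that expression profiles are numeric vectors, that divergence is measured by Euclidean distance to an ancestral profile, and that a single cutoff separates "diverged" from "not diverged". The function names, the cutoff, the rule ordering, and the class labels "conservation" and "specialization" are illustrative assumptions and are not taken from this record.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(parent, child, ancestral, cutoff):
    """Toy decision tree over expression distances (hypothetical rules).

    parent, child: expression profiles of the two duplicate copies.
    ancestral: expression profile standing in for the pre-duplication gene.
    cutoff: assumed distance above which a profile counts as diverged.
    """
    # Combined profile of both copies, used to test whether the pair
    # jointly reproduces the ancestral expression pattern.
    combined = [p + c for p, c in zip(parent, child)]

    parent_diverged = euclidean(parent, ancestral) > cutoff
    child_diverged = euclidean(child, ancestral) > cutoff
    combined_diverged = euclidean(combined, ancestral) > cutoff

    if not parent_diverged and not child_diverged:
        return "conservation"
    if parent_diverged and not child_diverged:
        return "neofunctionalization (parent copy)"
    if child_diverged and not parent_diverged:
        return "neofunctionalization (child copy)"
    if not combined_diverged:
        return "subfunctionalization"
    return "specialization"
```

For example, `classify([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], 0.5)` falls in the subfunctionalization branch, because each copy alone differs from the ancestral profile while their sum matches it. A hard cutoff of this kind ignores stochastic shifts in expression, which is the limitation the abstract says CLOUD is designed to address.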
Pages: 1209-1224
Number of pages: 16