From De Novo to "De Nono": The Majority of Novel Protein-Coding Genes Identified with Phylostratigraphy Are Old Genes or Recent Duplicates

Cited by: 37
Authors
Casola, Claudio [1]
Affiliations
[1] Texas A&M Univ, Dept Ecosyst Sci & Management, College Stn, TX 77843 USA
Source
GENOME BIOLOGY AND EVOLUTION | 2018 / Vol. 10 / Issue 11
Funding
National Institute of Food and Agriculture (United States);
Keywords
de novo genes; synteny; gene age; GENOME; AGGREGATION; EVOLUTION; ALIGNMENTS; ANNOTATION; SEQUENCE; INSIGHTS; ORIGIN; MOUSE;
DOI
10.1093/gbe/evy231
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The evolution of novel protein-coding genes from noncoding regions of the genome is one of the most compelling pieces of evidence for genetic innovations in nature. One popular approach to identify de novo genes is phylostratigraphy, which consists of determining the approximate time of origin (age) of a gene based on its distribution along a species phylogeny. Several studies have revealed significant flaws in determining the age of genes, including de novo genes, using phylostratigraphy alone. However, the rate of false positives in de novo gene surveys based on phylostratigraphy remains unknown. Here, I reanalyze the findings from three studies, two of which identified tens to hundreds of rodent-specific de novo genes adopting a phylostratigraphy-centered approach. Most putative de novo genes discovered in these investigations are no longer included in recently updated mouse gene sets. Using a combination of synteny information and sequence similarity searches, I show that ~60% of the remaining 381 putative de novo genes share homology with genes from other vertebrates, originated through gene duplication, and/or share no synteny information with nonrodent mammals. These results led to an estimated rate of ~12 de novo genes per million years in mouse. Contrary to a previous study (Wilson BA, Foy SG, Neme R, Masel J. 2017. Young genes are highly disordered as predicted by the preadaptation hypothesis of de novo gene birth. Nat Ecol Evol. 1:0146), I found no evidence supporting the preadaptation hypothesis of de novo gene formation. Nearly half of the de novo genes confirmed in this study are within older genes, indicating that co-option of preexisting regulatory regions and a higher GC content may facilitate the origin of novel genes.
Pages: 2906-2918
Page count: 13