Strong Heterogeneity in Mutation Rate Causes Misleading Hallmarks of Natural Selection on Indel Mutations in the Human Genome

Cited by: 11
Authors
Kvikstad, Erika M. [1]
Duret, Laurent [1]
Affiliations
[1] Univ Lyon 1, CNRS, Lab Biometrie & Biol Evolut, UMR 5558, F-69622 Villeurbanne, France
Keywords
indels; natural selection; homoplasy; sequence evolution; protein evolution; intron length; insertions; DNA; deletions; recombination; sequence; size; context; constraints
DOI
10.1093/molbev/mst185
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Elucidating the mechanisms of mutation accumulation and fixation is critical to understand the nature of genetic variation and its contribution to genome evolution. Of particular interest is the effect of insertions and deletions (indels) on the evolution of genome landscapes. Recent population-scaled sequencing efforts provide unprecedented data for analyzing the relative impact of selection versus nonadaptive forces operating on indels. Here, we combined McDonald-Kreitman tests with the analysis of derived allele frequency spectra to investigate the dynamics of allele fixation of short (1-50 bp) indels in the human genome. Our analyses revealed apparently higher fixation probabilities for insertions than deletions. However, this fixation bias is not consistent with either selection or biased gene conversion and varies with local mutation rate, being particularly pronounced at indel hotspots. Furthermore, we identified an unprecedented number of loci with evidence for multiple indel events in the primate phylogeny. Even in nonrepetitive sequence contexts (a priori not prone to indel mutations), such loci are 60-fold more frequent than expected according to a model of uniform indel mutation rate. This provides evidence of as yet unidentified cryptic indel hotspots. We propose that indel homoplasy, at known and cryptic hotspots, produces systematic errors in determination of ancestral alleles via parsimony and advise caution interpreting classic selection tests given the strong heterogeneity in indel rates across the genome. These results will have great impact on studies seeking to infer evolutionary forces operating on indels observed in closely related species, because such mutations are traditionally presumed homoplasy-free.
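As a rough illustration of the kind of McDonald-Kreitman comparison the abstract mentions, the sketch below builds a 2x2 table of insertions versus deletions, split into fixed and polymorphic events, and computes an odds ratio with a two-sided Fisher exact test. The table layout, the function names, and the example counts are assumptions made for illustration; they are not taken from the paper and may differ from the authors' actual procedure.

```python
from math import comb


def mk_odds_ratio(fixed_ins, fixed_del, poly_ins, poly_del):
    """Odds ratio of the table [[fixed_ins, poly_ins], [fixed_del, poly_del]].

    A value above 1 means insertions are over-represented among fixed
    events relative to polymorphic ones (an apparent fixation bias).
    """
    return (fixed_ins * poly_del) / (fixed_del * poly_ins)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    fixed_ins, fixed_del, poly_ins, poly_del = 30, 20, 15, 25
    print(mk_odds_ratio(fixed_ins, fixed_del, poly_ins, poly_del))
    print(fisher_exact_two_sided(fixed_ins, poly_ins, fixed_del, poly_del))
```

A significant odds ratio from such a table does not by itself indicate selection: as the abstract argues, indel homoplasy at hotspots can bias the parsimony-based assignment of ancestral alleles and produce the same pattern.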
Pages: 23-36
Page count: 14