Multi-locus match probability in a finite population: a fundamental difference between the Moran and Wright-Fisher models

Cited by: 14
Authors
Bhaskar, Anand [1 ]
Song, Yun S. [1 ,2 ]
Affiliations
[1] Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
DOI
10.1093/bioinformatics/btp227
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: A fundamental problem in population genetics, which is also important to forensic science, is to compute the match probability (MP) that two individuals randomly chosen from a population have identical alleles at a collection of loci. At present, 11-13 unlinked autosomal microsatellite loci are typed for forensic use. In a finite population, the genealogical relationships of individuals can create statistical non-independence of alleles at unlinked loci. However, the so-called product rule, which is used in courts in the USA, computes the MP for multiple unlinked loci by assuming statistical independence, multiplying the one-locus MPs at those loci. Analytically testing the accuracy of the product rule for more than five loci has hitherto remained an open problem.
Results: In this article, we adopt a flexible graphical framework to compute multi-locus MPs analytically. We consider two standard models of random mating, namely the Wright-Fisher (WF) and Moran models. We succeed in computing haplotypic MPs for up to 10 loci in the WF model, and up to 13 loci in the Moran model. For a finite population and a large number of loci, we show that the MPs predicted by the product rule are highly sensitive to mutation rates in the range of interest, while the true MPs computed using our graphical framework are not. Furthermore, we show that the WF and Moran models may produce drastically different MPs for a finite population, and that this difference grows with the number of loci and mutation rates. Although the two models converge to the same coalescent or diffusion limit, in which the population size approaches infinity, we demonstrate that, when multiple loci are considered, the rate of convergence in the Moran model is significantly slower than that in the WF model.
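The product rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' graphical framework; it only shows the independence assumption that the abstract questions. The function names are hypothetical. The one-locus formula 1/(1 + theta) is an assumption made here for illustration: it is the standard one-locus match probability under the infinite-alleles model in the coalescent limit, and it may differ from the mutation model used in the paper.

```python
import math
from typing import Iterable


def one_locus_mp_infinite_alleles(theta: float) -> float:
    """One-locus match probability 1 / (1 + theta).

    Assumption (not from the paper): infinite-alleles mutation model in the
    coalescent limit, with theta the scaled mutation rate.
    """
    if theta < 0:
        raise ValueError("theta must be non-negative")
    return 1.0 / (1.0 + theta)


def product_rule_mp(one_locus_mps: Iterable[float]) -> float:
    """Multi-locus match probability under the product rule.

    Treats the loci as statistically independent and multiplies the
    one-locus match probabilities together.
    """
    mps = list(one_locus_mps)
    if any(not 0.0 <= p <= 1.0 for p in mps):
        raise ValueError("each one-locus MP must lie in [0, 1]")
    return math.prod(mps)


if __name__ == "__main__":
    # Hypothetical theta values, chosen only to show how strongly the
    # product-rule MP for 13 loci depends on the mutation rate.
    for theta in (0.5, 1.0, 2.0):
        per_locus = [one_locus_mp_infinite_alleles(theta)] * 13
        print(theta, product_rule_mp(per_locus))
```

Because the per-locus values are multiplied, a modest change in each one-locus MP compounds across loci, which is one way to see why the product-rule prediction is sensitive to the mutation rate when many loci are typed.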
Pages: I187-I195
Number of pages: 9