Importance Sampling for the Infinite Sites Model

Cited by: 0
Authors
Hobolth, Asger [1]
Uyenoyama, Marcy K. [2]
Wiuf, Carsten [1]
Affiliations
[1] Aarhus Univ, DK-8000 Aarhus C, Denmark
[2] Duke Univ, Durham, NC 27706 USA
Keywords
ancestral inference; coalescent; importance sampling; infinite sites
DOI
Not available
CLC classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Importance sampling or Markov chain Monte Carlo sampling is required for state-of-the-art statistical analysis of population genetics data. The applicability of these sampling-based inference techniques depends crucially on the proposal distribution. In this paper, we discuss importance sampling for the infinite sites model. The infinite sites assumption is attractive because it constrains the number of possible genealogies, thereby allowing for the analysis of larger data sets. We recall the Griffiths-Tavaré and Stephens-Donnelly proposals and emphasize the relation between the latter proposal and exact sampling from the infinite alleles model. We also introduce a new proposal that takes knowledge of the ancestral state into account. The new proposal is derived from a new result on exact sampling from a single site. The methods are illustrated on simulated data sets and on the data considered in Griffiths and Tavaré (1994).
Pages: 26
Related papers
21 in total
[1] De Iorio M, 2004, ADV APPL PROBAB, V36, P417, DOI 10.1239/aap/1086957579
[2] Donnelly, P. Partition structures, Polya urns, the Ewens sampling formula, and the ages of alleles [J]. Theoretical Population Biology, 1986, 30 (02): 271-288
[3] Ethier SN, 1987, ANN PROBAB, V15, P414
[4] Ewens WJ, 1972, THEOR POPUL BIOL, V3, P87, DOI 10.1016/0040-5809(72)90035-4
[5] Felsenstein J., 1999, Likelihoods on coalescents: a Monte Carlo sampling approach to inferring parameters from population samples of molecular data, V33, P163, DOI 10.1214/LNMS/1215455552
[6] Fu, YX. Statistical properties of segregating sites [J]. Theoretical Population Biology, 1995, 48 (02): 172-197
[7] Givens, GH; Raftery, AE. Local adaptive importance sampling for multivariate densities with strong nonlinear relationships [J]. Journal of the American Statistical Association, 1996, 91 (433): 132-141
[8] Griffiths, RC; Tavaré, S. Ancestral inference in population genetics [J]. Statistical Science, 1994, 9 (03): 307-319
[9] Griffiths, RC; Lessard, S. Ewens sampling formula and related formulae: combinatorial proofs, extensions to variable population size and applications to ages of alleles [J]. Theoretical Population Biology, 2005, 68 (03): 167-177
[10] Gusfield, D. Efficient algorithms for inferring evolutionary trees [J]. Networks, 1991, 21 (01): 19-28