Automatic inference of demographic parameters using generative adversarial networks

Cited by: 36
Authors
Wang, Zhanpeng [1]
Wang, Jiaping [1]
Kourakos, Michael [2]
Hoang, Nhung [2]
Lee, Hyong Hark [2]
Mathieson, Iain [3]
Mathieson, Sara [1]
Affiliations
[1] Haverford Coll, Dept Comp Sci, Haverford, PA 19041 USA
[2] Swarthmore Coll, Dept Comp Sci, Swarthmore, PA 19081 USA
[3] Univ Penn, Dept Genet, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA);
Keywords
demographic inference; evolutionary modelling; generative adversarial network; simulated data; NATURAL-SELECTION; RECOMBINATION; LANDSCAPE; SAMPLES; MODEL; SITE;
DOI
10.1111/1755-0998.13386
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Population genetics relies heavily on simulated data for validation, inference and intuition. In particular, since the evolutionary 'ground truth' for real data is always limited, simulated data are crucial for training supervised machine learning methods. Simulation software can accurately model evolutionary processes but requires many hand-selected input parameters. As a result, simulated data often fail to mirror the properties of real genetic data, which limits the scope of methods that rely on them. Here, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method, pg-gan, is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation-with-migration model. We then apply our method to human data from the 1000 Genomes Project and show that we can accurately recapitulate the features of real data.
Pages: 2689-2705
Page count: 17