A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots

Cited by: 297
Authors
Lawson, Daniel J. [1]
Van Dorp, Lucy [2, 3]
Falush, Daniel [4]
Affiliations
[1] Univ Bristol, Integrat Epidemiol Unit, Populat Hlth Sci, Bristol BS8 1TH, Avon, England
[2] UCL, Genet Inst UGI, London WC1E 6BT, England
[3] UCL, Ctr Math & Phys Life Sci & Expt Biol CoMPLEX, London WC1E 6BT, England
[4] Univ Bath, Milner Ctr Evolut, Bath BA2 7AY, Avon, England
Source
NATURE COMMUNICATIONS | 2018 / Volume 9
Funding
Medical Research Council (UK); Wellcome Trust (UK)
Keywords
MULTILOCUS GENOTYPE DATA; POPULATION-STRUCTURE; GENETIC-STRUCTURE; PROGRAM STRUCTURE; HISTORY; INFERENCE; CLUSTERS; INDIVIDUALS; SIMULATION; SEQUENCE;
DOI
10.1038/s41467-018-05257-7
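Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract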
Genetic clustering algorithms, implemented in programs such as STRUCTURE and ADMIXTURE, have been used extensively in the characterisation of individuals and populations based on genetic data. A successful example is the reconstruction of the genetic history of African Americans as a product of recent admixture between highly differentiated populations. Histories can also be reconstructed using the same procedure for groups that do not have admixture in their recent history, where recent genetic drift is strong or that deviate in other ways from the underlying inference model. Unfortunately, such histories can be misleading. We have implemented an approach, badMIXTURE, to assess the goodness of fit of the model using the ancestry "palettes" estimated by CHROMOPAINTER and apply it to both simulated data and real case studies. Combining these complementary analyses with additional methods that are designed to test specific hypotheses allows a richer and more robust analysis of recent demographic history.
Pages: 11