Network-based hierarchical population structure analysis for large genomic data sets

Cited by: 8
Authors
Greenbaum, Gili [1]
Rubin, Amir [2]
Templeton, Alan R. [3, 4]
Rosenberg, Noah A. [1]
Affiliations
[1] Stanford Univ, Dept Biol, Stanford, CA 94305 USA
[2] Ben Gurion Univ Negev, Dept Comp Sci, IL-8410501 Beer Sheva, Israel
[3] Washington Univ, Dept Biol, St Louis, MO 63130 USA
[4] Univ Haifa, Dept Evolutionary & Environm Ecol, IL-31905 Haifa, Israel
Funding
U.S. National Institutes of Health;
Keywords
ARABIDOPSIS-THALIANA; GENETIC-VARIATION; PROBABILISTIC MODELS; LOCAL ADAPTATION; WIDE PATTERNS; SNP DATA; INFERENCE; GENOTYPE; IDENTIFICATION; EIGENANALYSIS;
DOI
10.1101/gr.250092.119
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Analysis of population structure in natural populations using genetic data is a common practice in ecological and evolutionary studies. With large genomic data sets of populations now appearing more frequently across the taxonomic spectrum, it is becoming increasingly possible to reveal many hierarchical levels of structure, including fine-scale genetic clusters. To analyze these data sets, methods need to be appropriately suited to the challenges of extracting multilevel structure from whole-genome data. Here, we present a network-based approach for constructing population structure representations from genetic data. The use of community-detection algorithms from network theory generates a natural hierarchical perspective on the representation that the method produces. The method is computationally efficient, and it requires relatively few assumptions regarding the biological processes that underlie the data. We show the approach by analyzing population structure in the model plant species Arabidopsis thaliana and in human populations. These examples illustrate how network-based approaches for population structure analysis are well-suited to extracting valuable ecological and evolutionary information in the era of large genomic data sets.
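The general idea of a network-based, hierarchical view of population structure can be illustrated with a small sketch. This is not the authors' implementation: the similarity measure (mean identity-by-state over loci), the thresholds, the toy genotypes, and all function names are assumptions made for illustration, and plain connected components stand in for the community-detection algorithms that the abstract refers to.

```python
from itertools import combinations


def ibs_similarity(g1, g2):
    """Mean identity-by-state over loci (assumed measure).

    Genotypes are coded as alternate-allele counts 0, 1, or 2.
    """
    shared = [1.0 - abs(a - b) / 2.0 for a, b in zip(g1, g2)]
    return sum(shared) / len(shared)


def build_network(genotypes):
    """Weighted network: one edge per pair of individuals."""
    return {
        (i, j): ibs_similarity(genotypes[i], genotypes[j])
        for i, j in combinations(sorted(genotypes), 2)
    }


def components(nodes, edges, threshold):
    """Connected components after dropping edges below the threshold.

    Stand-in for community detection, chosen only to keep the sketch
    dependency-free.
    """
    adjacency = {n: set() for n in nodes}
    for (i, j), weight in edges.items():
        if weight >= threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(adjacency[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


def hierarchy(genotypes, thresholds):
    """Groups at each threshold; higher thresholds give finer structure."""
    edges = build_network(genotypes)
    return {t: components(genotypes, edges, t) for t in sorted(thresholds)}


# Hypothetical toy data: six individuals, six biallelic loci.
toy = {
    "a1": [0, 0, 0, 0, 2, 2],
    "a2": [0, 0, 0, 1, 2, 2],
    "b1": [0, 0, 2, 2, 2, 2],
    "b2": [0, 0, 2, 2, 2, 1],
    "c1": [2, 2, 2, 2, 0, 0],
    "c2": [2, 2, 2, 1, 0, 0],
}
levels = hierarchy(toy, [0.3, 0.5, 0.8])
for threshold, groups in levels.items():
    print(threshold, groups)
```

On this toy input the three thresholds give one, two, and three nested groups, which is the kind of multilevel picture the abstract describes. A real analysis would replace the component step with a modularity-based community-detection algorithm and would need a principled way to choose thresholds.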
Pages: 2020-2033
Page count: 14