A genome-wide study of Hardy-Weinberg equilibrium with next generation sequence data

Cited by: 70
Authors
Graffelman, Jan [1, 2]
Jain, Deepti [2]
Weir, Bruce [2]
Affiliations
[1] Univ Politecn Cataluna, Dept Stat & Operat Res, Avinguda Diagonal 647, E-08028 Barcelona, Spain
[2] Univ Washington, Dept Biostat, Univ Tower,15th Floor,4333 Brooklyn Ave, Seattle, WA 98105 USA
Funding
National Institutes of Health (USA)
Keywords
GENOTYPING ERRORS; GENETIC-MARKERS; EXACT TESTS; ASSOCIATION
DOI
10.1007/s00439-017-1786-7
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Statistical tests for Hardy-Weinberg equilibrium have been an important tool for detecting genotyping errors in the past, and remain important in the quality control of next generation sequence data. In this paper, we analyze complete chromosomes of the 1000 genomes project by using exact test procedures for autosomal and X-chromosomal variants. We find that the rate of disequilibrium largely exceeds what might be expected by chance alone for all chromosomes. Observed disequilibrium is, in about 60% of the cases, due to heterozygote excess. We suggest that most excess disequilibrium can be explained by sequencing problems, and hypothesize mechanisms that can explain exceptional heterozygosities. We report higher rates of disequilibrium for the MHC region on chromosome 6, regions flanking centromeres and p-arms of acrocentric chromosomes. We also detected long-range haplotypes and areas with incidental high disequilibrium. We report disequilibrium to be related to read depth, with variants having extreme read depths being more likely to be out of equilibrium. Disequilibrium rates were found to be 11 times higher in segmental duplications and simple tandem repeat regions. The variants with significant disequilibrium are seen to be concentrated in these areas. For next generation sequence data, Hardy-Weinberg disequilibrium seems to be a major indicator for copy number variation.
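To make the "exact test procedures" mentioned in the abstract concrete, the following is a minimal sketch of a textbook exact test for Hardy-Weinberg equilibrium at a single biallelic autosomal variant. It is not the authors' implementation, which this record does not describe. The function name `hwe_exact_test`, the result fields, and the choice to sum all outcomes no more probable than the observed one are assumptions made for illustration. The X-chromosomal test is not covered.

```python
from collections import namedtuple
from math import exp, lgamma, log

# Hypothetical result container, introduced for this illustration only.
HweResult = namedtuple("HweResult", ["p_value", "expected_het", "direction"])


def _log_factorial(k):
    """Natural log of k!, computed via the log-gamma function."""
    return lgamma(k + 1)


def hwe_exact_test(n_aa, n_ab, n_bb):
    """Exact HWE test for genotype counts AA, AB, BB (autosomal, biallelic).

    Conditions on the observed allele counts and enumerates every possible
    heterozygote count. The p-value is the total probability of all outcomes
    that are no more probable than the observed one.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    # Part of the log-probability that does not depend on the het count.
    const = (_log_factorial(n) + _log_factorial(n_a) + _log_factorial(n_b)
             - _log_factorial(2 * n))

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (const + het * log(2)
                - _log_factorial(hom_a) - _log_factorial(het)
                - _log_factorial(hom_b))

    # The het count must have the same parity as the allele counts.
    probs = {h: exp(log_prob(h))
             for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}

    p_obs = probs[n_ab]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    expected_het = sum(h * p for h, p in probs.items())

    if n_ab > expected_het:
        direction = "excess"
    elif n_ab < expected_het:
        direction = "deficit"
    else:
        direction = "none"
    return HweResult(min(p_value, 1.0), expected_het, direction)


if __name__ == "__main__":
    # Made-up counts with more heterozygotes than expected under equilibrium.
    print(hwe_exact_test(n_aa=10, n_ab=80, n_bb=10))
```

The `direction` field separates heterozygote excess from heterozygote deficit, which is the distinction the abstract draws when it attributes about 60% of the observed disequilibrium to heterozygote excess.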
Pages: 727-741
Page count: 15