Simultaneous Genotype Calling and Haplotype Phasing Improves Genotype Accuracy and Reduces False-Positive Associations for Genome-wide Association Studies

Times cited: 153
Authors
Browning, Brian L. [1]
Yu, Zhaoxia [2]
Affiliations
[1] Univ Auckland, Dept Stat, Auckland 1142, New Zealand
[2] Univ Calif Irvine, Dept Stat, Irvine, CA 92697 USA
Funding
Wellcome Trust (UK);
Keywords
HIDDEN MARKOV-MODELS; SUSCEPTIBILITY LOCI; LARGE-SCALE; UNRELATED INDIVIDUALS; ARRAY DATA; INFERENCE; IMPUTATION; ALGORITHM; DISEASE; POLYMORPHISMS;
DOI
10.1016/j.ajhg.2009.11.004
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
We present a novel method for simultaneous genotype calling and haplotype-phase inference. Our method employs the computationally efficient BEAGLE haplotype-frequency model, which can be applied to large-scale studies with millions of markers and thousands of samples. We compare genotype calls made with our method to genotype calls made with the BIRDSEED, CHIAMO, GenCall, and ILLUMINUS genotype-calling methods, using genotype data from the Illumina 550K and Affymetrix 500K arrays. We show that our method has higher genotype-call accuracy and yields fewer uncalled genotypes than competing methods. We perform single-marker analysis of data from the Wellcome Trust Case Control Consortium bipolar disorder and type 2 diabetes studies. For bipolar disorder, the genotype calls in the original study yield 25 markers with apparent false-positive association with bipolar disorder at a p < 10^-7 significance level, whereas genotype calls made with our method yield no associated markers at this significance threshold. Conversely, for markers with replicated association with type 2 diabetes, there is good concordance between genotype calls used in the original study and calls made by our method. Results from single-marker and haplotypic analysis of our method's genotype calls for the bipolar disorder study indicate that our method is highly effective at eliminating genotyping artifacts that cause false-positive associations in genome-wide association studies. Our new genotype-calling methods are implemented in the BEAGLE and BEAGLECALL software packages.
Pages: 847-861
Number of pages: 15