Extending the use of GWAS data by combining data from different genetic platforms

Cited by: 5
Authors
van Iperen, E. P. A. [1, 2]
Hovingh, G. K. [3]
Asselbergs, F. W. [1, 4, 5]
Zwinderman, A. H. [2]
Affiliations
[1] Netherlands Heart Inst, Durrer Ctr Cardiovasc Res, Utrecht, Netherlands
[2] Acad Med Ctr, Dept Clin Epidemiol Biostat & Bioinformat, Amsterdam, Netherlands
[3] Acad Med Ctr, Dept Vasc Med, Amsterdam, Netherlands
[4] UMC, Div Heart & Lungs, Dept Cardiol, Utrecht, Netherlands
[5] UCL, Fac Populat Hlth Sci, Inst Cardiovasc Sci, London, England
Source
PLOS ONE | 2017 / Vol. 12 / Issue 02
Keywords
GENOME-WIDE ASSOCIATION; GENOTYPE IMPUTATION; LOCI; VARIANTS
DOI
10.1371/journal.pone.0172082
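Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [Natural Sciences, General]
Subject classification codes
07; 0710; 09
Abstract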
Background In the past decade, many genome-wide association studies (GWAS) have discovered new associations between single-nucleotide polymorphisms (SNPs) and various phenotypes. Imputation methods are widely used in GWAS: they make it possible to test phenotype associations with variants that were not directly genotyped. Imputation methods can also be used to combine and analyse data genotyped on different genotyping arrays. In this study we investigated the imputation quality and efficiency of two approaches to combining GWAS data from different genotyping platforms. Specifically, we investigated whether combining data from different platforms before imputation performs better than combining the data after imputation.
Methods In total, 979 unique individuals from the AMC-PAS cohort were genotyped on three different platforms: 706 individuals on the MetaboChip, 757 on the 50K gene-centric Human CVD BeadChip, and 955 on the HumanExome chip. A total of 397 individuals were genotyped on all three platforms. After pre-imputation quality control (QC), Minimac in combination with MaCH was used to impute all samples with the 1000 Genomes reference panel. All imputed markers with an r² value of <0.3 were excluded in the post-imputation QC.
Results All three datasets were carefully matched on strand, SNP ID and genomic coordinates. This resulted in a dataset of 979 unique individuals and a total of 258,925 unique markers. A total of 4,117,036 SNPs were available when imputation was performed before merging the three datasets, compared with 3,933,494 SNPs when imputation was done on the combined set. Our results suggest that imputation of the individual datasets before merging performs slightly better than imputation after combining the datasets.
Conclusions Imputation of datasets genotyped by different platforms before merging generates more SNPs than imputation after putting the datasets together.
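The post-imputation QC step described in the abstract (excluding imputed markers with r² < 0.3) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `post_imputation_qc`, the marker identifiers, and the r² values are hypothetical, and the sketch assumes that per-marker r² values have already been read from the imputation output into a dictionary.

```python
from typing import Dict

# Threshold taken from the abstract: markers with r^2 < 0.3 are excluded.
R2_THRESHOLD = 0.3


def post_imputation_qc(r2_by_marker: Dict[str, float],
                       threshold: float = R2_THRESHOLD) -> Dict[str, float]:
    """Keep only markers whose imputation r^2 is at least `threshold`.

    Hypothetical helper for illustration; not part of Minimac or MaCH.
    """
    return {marker: r2 for marker, r2 in r2_by_marker.items() if r2 >= threshold}


# Toy, made-up values for illustration only (not data from the study).
toy_r2 = {"rs_a": 0.95, "rs_b": 0.29, "rs_c": 0.30, "rs_d": 0.10}
kept = post_imputation_qc(toy_r2)
print(sorted(kept))
```

Under this reading, a marker with r² exactly equal to 0.3 is retained, because only values strictly below the threshold are excluded.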
Pages: 11