Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm

Cited by: 121
Authors
Hoffmann, Thomas J. [1, 5]
Zhan, Yiping [2]
Kvale, Mark N. [1]
Hesselson, Stephanie E. [1]
Gollub, Jeremy [2]
Iribarren, Carlos [3]
Lu, Yontao [2]
Mei, Gangwu [2]
Purdy, Matthew M. [2]
Quesenberry, Charles [3]
Rowell, Sarah [3]
Shapero, Michael H. [2]
Smethurst, David [3]
Somkin, Carol P. [3]
Van den Eeden, Stephen K. [3]
Walter, Larry [3]
Webster, Teresa [2]
Whitmer, Rachel A. [3]
Finn, Andrea [2]
Schaefer, Catherine [3]
Kwok, Pui-Yan [1, 4]
Risch, Neil [1, 3, 5]
Affiliations
[1] Univ Calif San Francisco, Inst Human Genet, San Francisco, CA 94143 USA
[2] Affymetrix Inc, Santa Clara, CA USA
[3] Kaiser Permanente No Calif Div Res, Oakland, CA USA
[4] Univ Calif San Francisco, Cardiovasc Res Inst, San Francisco, CA 94143 USA
[5] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA 94143 USA
Funding
National Institutes of Health (USA);
Keywords
Microarray; Genome-wide association study; Coverage; Imputation; Single nucleotide polymorphism; Throughput; GENOME-WIDE ASSOCIATION; HUMAN-POPULATIONS; COMMON; ADMIXTURE; GENETICS; DISEASES; LOCI; MAP;
DOI
10.1016/j.ygeno.2011.08.007
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. © 2011 Elsevier Inc. All rights reserved.
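The hybrid selection loop named in the abstract (rounds of greedy pairwise selection, each followed by removing imputation-covered SNPs from the target set) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `tags`, `greedy_round`, `hybrid_select`, the `ld` dictionary of pairwise r² values, the 0.8 threshold, and the `imputed_ok` callback are all assumptions introduced here for illustration. In a real design, `imputed_ok` would be backed by an imputation program run against a reference panel.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical input: pairwise r^2 between SNP ids, stored in either key order.
LD = Dict[Tuple[str, str], float]


def tags(ld: LD, a: str, b: str, threshold: float) -> bool:
    """True if SNP a covers SNP b pairwise (same SNP, or r^2 >= threshold)."""
    if a == b:
        return True
    return max(ld.get((a, b), 0.0), ld.get((b, a), 0.0)) >= threshold


def greedy_round(remaining: Set[str], pool: Set[str], ld: LD,
                 threshold: float, budget: int) -> Tuple[List[str], Set[str]]:
    """Pick up to `budget` SNPs, each time taking the candidate that
    pairwise-tags the most still-uncovered targets."""
    picked: List[str] = []
    uncovered = set(remaining)
    available = set(pool)
    while uncovered and available and len(picked) < budget:
        # sorted() makes tie-breaking deterministic (lowest id wins).
        best = max(sorted(available),
                   key=lambda c: sum(tags(ld, c, t, threshold) for t in uncovered))
        newly = {t for t in uncovered if tags(ld, best, t, threshold)}
        if not newly:
            break  # no candidate covers anything further
        picked.append(best)
        available.discard(best)
        uncovered -= newly
    return picked, uncovered


def hybrid_select(targets: Set[str], candidates: Set[str], ld: LD,
                  imputed_ok: Callable[[List[str], Set[str]], Set[str]],
                  threshold: float = 0.8, per_round: int = 1,
                  max_rounds: int = 10) -> Tuple[List[str], Set[str]]:
    """Alternate greedy pairwise rounds with removal of imputation-covered targets.

    `imputed_ok(selected, remaining)` is an assumed callback that returns the
    subset of `remaining` judged well imputed from the SNPs selected so far.
    """
    selected: List[str] = []
    remaining = set(targets)
    pool = set(candidates)
    for _ in range(max_rounds):
        if not remaining or not pool:
            break
        picked, remaining = greedy_round(remaining, pool, ld, threshold, per_round)
        if not picked:
            break
        selected.extend(picked)
        pool -= set(picked)
        # Hybrid step: drop targets that imputation already covers.
        remaining -= imputed_ok(selected, remaining)
    return selected, remaining
```

In this sketch, a target that no single selected SNP tags pairwise can still leave the target set once the callback reports it as imputable, so no array position is spent on it. That is the efficiency gain the abstract attributes to imputation-based removal over pairwise tagging alone.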
Pages: 422-430
Number of pages: 9