Use of weighted reference panels based on empirical estimates of ancestry for capturing untyped variation

Cited by: 17
Authors
Egyud, Matthew R. L. [1 ,2 ,3 ]
Gajdos, Zofia K. Z. [1 ,2 ,4 ,5 ]
Butler, Johannah L. [1 ,2 ,4 ]
Tischfield, Sam [1 ,2 ,4 ]
Le Marchand, Loic [6 ]
Kolonel, Laurence N. [6 ]
Haiman, Christopher A. [7 ]
Henderson, Brian E. [7 ]
Hirschhorn, Joel N. [1 ,2 ,4 ,5 ]
Affiliations
[1] Childrens Hosp, Program Genom, Boston, MA 02115 USA
[2] Childrens Hosp, Div Endocrinol, Boston, MA 02115 USA
[3] Boston Univ, Sch Med, Boston, MA 02118 USA
[4] Broad Inst MIT & Harvard, Program Med & Populat Genet, Cambridge, MA 02142 USA
[5] Harvard Univ, Sch Med, Dept Genet, Boston, MA 02115 USA
[6] Univ Hawaii, Canc Res Ctr, Honolulu, HI 96813 USA
[7] Univ So Calif, Keck Sch Med, Dept Prevent Med, Los Angeles, CA 90089 USA
Keywords
MULTILOCUS GENOTYPE DATA; GENOME-WIDE ASSOCIATION; LINKAGE DISEQUILIBRIUM; GENETIC ASSOCIATION; HAPLOTYPE MAP; POPULATIONS; TRANSFERABILITY; POWER; SNPS;
DOI
10.1007/s00439-009-0627-8
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Many association methods use a subset of genotyped single nucleotide polymorphisms (SNPs) to capture or infer genotypes at other, untyped SNPs. We and others previously showed that tag SNPs selected to capture common variation using data from the International HapMap Consortium (Nature 437:1299-1320, 2005; Nature 449:851-861, 2007) could also capture variation in populations of similar ancestry to the HapMap reference populations (de Bakker et al. in Nat Genet 38:1298-1303, 2006; Gonzalez-Neira et al. in Genome Res 16:323-330, 2006; Montpetit et al. in PLoS Genet 2:282-290, 2006; Mueller et al. in Am J Hum Genet 76:387-398, 2005). To capture variation in admixed populations or populations less similar to HapMap panels, a "cosmopolitan approach," in which all samples from HapMap are used as a single reference panel, was proposed. Here we refine this suggestion and show that use of a "weighted reference panel," constructed based on empirical estimates of ancestry in the target population (relative to available reference panels), is more efficient than the cosmopolitan approach. Across the five populations of the Multiethnic Cohort, weighted reference panels capture, on average, only slightly fewer common variants (minor allele frequency > 5%) than the cosmopolitan approach (mean r² = 0.977 vs. 0.989; 94.5% vs. 96.8% of variation captured at r² > 0.8), but entail approximately 25% fewer tag SNPs per panel (average 538 vs. 718). These results extend a recent study in two Indian populations (Pemberton et al. in Ann Hum Genet 72:535-546, 2008). Weighted reference panels are potentially useful both for the selection of tag SNPs in diverse populations and, perhaps, for the design of reference panels for imputation of untyped genotypes in genome-wide association studies in admixed populations.
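The idea of a weighted reference panel can be illustrated with a minimal Python sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes haplotypes are tuples of 0/1 alleles, that ancestry weights for the target population are already estimated, and that a weighted panel is formed by drawing haplotypes from each reference panel in proportion to its weight. The function names `weighted_panel` and `r_squared`, the toy haplotypes, and the 0.75/0.25 weights are all hypothetical.

```python
import random


def weighted_panel(panels, weights, size, seed=0):
    """Draw `size` haplotypes, each panel contributing in proportion to its weight.

    panels  -- dict: panel name -> list of haplotypes (tuples of 0/1 alleles)
    weights -- dict: panel name -> estimated ancestry proportion (assumed given)
    """
    rng = random.Random(seed)
    names = list(panels)
    total = sum(weights[n] for n in names)
    # Largest-remainder allocation so the per-panel counts sum exactly to `size`.
    quotas = {n: size * weights[n] / total for n in names}
    counts = {n: int(quotas[n]) for n in names}
    leftover = size - sum(counts.values())
    by_remainder = sorted(names, key=lambda n: quotas[n] - counts[n], reverse=True)
    for n in by_remainder[:leftover]:
        counts[n] += 1
    panel = []
    for n in names:
        # Sampling with replacement lets a small panel still be up-weighted.
        panel.extend(rng.choices(panels[n], k=counts[n]))
    return panel, counts


def r_squared(haplotypes, i, j):
    """Squared allelic correlation (r^2) between sites i and j in a panel."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # a monomorphic site carries no LD information
    d = p_ij - p_i * p_j
    return d * d / denom


# Hypothetical toy data: two reference panels, two SNP sites per haplotype.
toy_panels = {
    "CEU": [(0, 0), (1, 1), (0, 0), (1, 1)],
    "YRI": [(0, 1), (1, 0), (0, 0), (1, 1)],
}
toy_weights = {"CEU": 0.75, "YRI": 0.25}  # assumed ancestry estimates
panel, counts = weighted_panel(toy_panels, toy_weights, size=8)
ld = r_squared(panel, 0, 1)  # how well site 0 would tag site 1 in this panel
```

In a sketch like this, a tag SNP would be said to capture an untyped SNP when `r_squared` in the weighted panel exceeds the 0.8 threshold quoted in the abstract. The reported saving also follows directly from the averages given: (718 - 538) / 718 is about 0.25, that is, roughly 25% fewer tag SNPs.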
Pages: 295-303
Number of pages: 9
Related papers
25 in total
  • [1] A haplotype map of the human genome
    Altshuler, D
    Brooks, LD
    Chakravarti, A
    Collins, FS
    Daly, MJ
    Donnelly, P
    Gibbs, RA
    Belmont, JW
    Boudreau, A
    Leal, SM
    Hardenbol, P
    Pasternak, S
    Wheeler, DA
    Willis, TD
    Yu, FL
    Yang, HM
    Zeng, CQ
    Gao, Y
    Hu, HR
    Hu, WT
    Li, CH
    Lin, W
    Liu, SQ
    Pan, H
    Tang, XL
    Wang, J
    Wang, W
    Yu, J
    Zhang, B
    Zhang, QR
    Zhao, HB
    Zhao, H
    Zhou, J
    Gabriel, SB
    Barry, R
    Blumenstiel, B
    Camargo, A
    Defelice, M
    Faggart, M
    Goyette, M
    Gupta, S
    Moore, J
    Nguyen, H
    Onofrio, RC
    Parkin, M
    Roy, J
    Stahl, E
    Winchester, E
    Ziaugra, L
    Shen, Y
    [J]. NATURE, 2005, 437 (7063) : 1299 - 1320
  • [2] Haploview: analysis and visualization of LD and haplotype maps
    Barrett, JC
    Fry, B
    Maller, J
    Daly, MJ
    [J]. BIOINFORMATICS, 2005, 21 (02) : 263 - 265
  • [3] Genetic signatures of strong recent positive selection at the lactase gene
    Bersaglieri, T
    Sabeti, PC
    Patterson, N
    Vanderploeg, T
    Schaffner, SF
    Drake, JA
    Rhodes, M
    Reich, DE
    Hirschhorn, JN
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2004, 74 (06) : 1111 - 1120
  • [4] Cann HM, 2002, SCIENCE, V296, P261
  • [5] Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium
    Carlson, CS
    Eberle, MA
    Rieder, MJ
    Yi, Q
    Kruglyak, L
    Nickerson, DA
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2004, 74 (01) : 106 - 120
  • [6] Detecting disease associations due to linkage disequilibrium using haplotype tags: A class of tests and the determinants of statistical power
    Chapman, JM
    Cooper, JD
    Todd, JA
    Clayton, DG
    [J]. HUMAN HEREDITY, 2003, 56 (1-3) : 18 - 31
  • [7] Use of unphased multilocus genotype data in indirect association studies
    Clayton, D
    Chapman, J
    Cooper, J
    [J]. GENETIC EPIDEMIOLOGY, 2004, 27 (04) : 415 - 428
  • [8] Transferability of tag SNPs in genetic association studies in multiple populations
    de Bakker, Paul I. W.
    Burtt, Noel P.
    Graham, Robert R.
    Guiducci, Candace
    Yelensky, Roman
    Drake, Jared A.
    Bersaglieri, Todd
    Penney, Kathryn L.
    Butler, Johannah
    Young, Stanton
    Onofrio, Robert C.
    Lyon, Helen N.
    O Stram, Daniel
    Haiman, Christopher A.
    Freedman, Matthew L.
    Zhu, Xiaofeng
    Cooper, Richard
    Groop, Leif
    Kolonel, Laurence N.
    Henderson, Brian E.
    Daly, Mark J.
    Hirschhorn, Joel N.
    Altshuler, David
    [J]. NATURE GENETICS, 2006, 38 (11) : 1298 - 1303
  • [9] Efficiency and power in genetic association studies
    de Bakker, PIW
    Yelensky, R
    Pe'er, I
    Gabriel, SB
    Daly, MJ
    Altshuler, D
    [J]. NATURE GENETICS, 2005, 37 (11) : 1217 - 1223
  • [10] A second generation human haplotype map of over 3.1 million SNPs
    Frazer, Kelly A.
    Ballinger, Dennis G.
    Cox, David R.
    Hinds, David A.
    Stuve, Laura L.
    Gibbs, Richard A.
    Belmont, John W.
    Boudreau, Andrew
    Hardenbol, Paul
    Leal, Suzanne M.
    Pasternak, Shiran
    Wheeler, David A.
    Willis, Thomas D.
    Yu, Fuli
    Yang, Huanming
    Zeng, Changqing
    Gao, Yang
    Hu, Haoran
    Hu, Weitao
    Li, Chaohua
    Lin, Wei
    Liu, Siqi
    Pan, Hao
    Tang, Xiaoli
    Wang, Jian
    Wang, Wei
    Yu, Jun
    Zhang, Bo
    Zhang, Qingrun
    Zhao, Hongbin
    Zhao, Hui
    Zhou, Jun
    Gabriel, Stacey B.
    Barry, Rachel
    Blumenstiel, Brendan
    Camargo, Amy
    Defelice, Matthew
    Faggart, Maura
    Goyette, Mary
    Gupta, Supriya
    Moore, Jamie
    Nguyen, Huy
    Onofrio, Robert C.
    Parkin, Melissa
    Roy, Jessica
    Stahl, Erich
    Winchester, Ellen
    Ziaugra, Liuda
    Altshuler, David
    Shen, Yan
    [J]. NATURE, 2007, 449 (7164) : 851 - U3