Robust relationship inference in genome-wide association studies

Cited by: 1929
Authors
Manichaikul, Ani [1, 2]
Mychaleckyj, Josyf C. [1]
Rich, Stephen S. [1]
Daly, Kathy [3]
Sale, Michele [1, 4, 5]
Chen, Wei-Min [1, 2]
Affiliations
[1] Univ Virginia, Ctr Publ Hlth Genom, Charlottesville, VA 22903 USA
[2] Univ Virginia, Dept Publ Hlth Sci, Div Biostatist & Epidemiol, Charlottesville, VA USA
[3] Univ Minnesota, Dept Otolaryngol, Minneapolis, MN USA
[4] Univ Virginia, Dept Med, Charlottesville, VA USA
[5] Univ Virginia, Dept Biochem & Mol Genet, Charlottesville, VA USA
Keywords
STRATIFICATION; HERITABILITY; LINKAGE; FAMILY; TESTS; MODEL
DOI
10.1093/bioinformatics/btq559
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Genome-wide association studies (GWASs) have been widely used to map loci contributing to variation in complex traits and risk of diseases in humans. Accurate specification of familial relationships is crucial for family-based GWAS, as well as in population-based GWAS with unknown (or unrecognized) family structure. The family structure in a GWAS should be routinely investigated using the SNP data prior to the analysis of population structure or phenotype. Existing algorithms for relationship inference have a major weakness of estimating allele frequencies at each SNP from the entire sample, under a strong assumption of homogeneous population structure. This assumption is often untenable.
Results: Here, we present a rapid algorithm for relationship inference using high-throughput genotype data typical of GWAS that allows the presence of unknown population substructure. The relationship of any pair of individuals can be precisely inferred by robust estimation of their kinship coefficient, independent of sample composition or population structure (sample invariance). We present simulation experiments to demonstrate that the algorithm has sufficient power to provide reliable inference on millions of unrelated pairs and thousands of relative pairs (up to 3rd-degree relationships). Application of our robust algorithm to HapMap and GWAS datasets demonstrates that it performs properly even under extreme population stratification, while algorithms assuming a homogeneous population give systematically biased results. Our extremely efficient implementation performs relationship inference on millions of pairs of individuals in a matter of minutes, dozens of times faster than the most efficient existing algorithm known to us.
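The abstract does not state the estimator itself. As an illustration of how a kinship coefficient can be computed for one pair without sample-wide allele frequencies, the following is a minimal sketch of the pairwise form commonly described as the KING-robust estimator. The function name, the 0/1/2 genotype coding, and the use of None for missing calls are assumptions made for this example, not details taken from the record above.

```python
def king_robust_kinship(g_i, g_j):
    """Sketch of a pairwise robust kinship estimate (assumed KING-robust form).

    g_i, g_j: equal-length sequences of genotypes coded as 0, 1, or 2 copies
    of one allele at each SNP; None marks a missing call (assumed coding).
    Only counts from this pair are used, so no allele frequencies from the
    rest of the sample enter the estimate.
    """
    het_i = het_j = 0   # heterozygous SNP counts for each individual
    het_het = 0         # SNPs where both individuals are heterozygous
    opp_hom = 0         # SNPs where the two carry opposite homozygotes
    for a, b in zip(g_i, g_j):
        if a is None or b is None:
            continue  # skip SNPs missing in either individual
        if a == 1:
            het_i += 1
        if b == 1:
            het_j += 1
        if a == 1 and b == 1:
            het_het += 1
        if abs(a - b) == 2:
            opp_hom += 1
    m = min(het_i, het_j)
    if m == 0:
        raise ValueError("no heterozygous SNPs in at least one individual")
    return (het_het - 2 * opp_hom) / (2 * m) + 0.5 - (het_i + het_j) / (4 * m)


if __name__ == "__main__":
    duplicate = king_robust_kinship([0, 1, 2, 1], [0, 1, 2, 1])
    discordant = king_robust_kinship([1, 1, 0, 2], [1, 0, 2, 1])
    print(duplicate, discordant)
```

Under this form, two identical genotype vectors give 0.5, and pairs with many opposite homozygotes are pushed toward negative values. In practice the estimate would be computed over hundreds of thousands of SNPs rather than the four used in the usage lines.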
Pages: 2867-2873
Number of pages: 7