A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals

被引:1287
作者
Browning, Brian L. [1 ]
Browning, Sharon R. [1 ]
机构
[1] Univ Auckland, Dept Stat, Auckland 1142, New Zealand
基金
英国惠康基金; 美国国家卫生研究院;
关键词
GENOME-WIDE ASSOCIATION; LARGE-SCALE; SUSCEPTIBILITY LOCI; METAANALYSIS; MODELS; POWER; MAP;
D O I
10.1016/j.ajhg.2009.01.005
中图分类号
Q3 [遗传学];
学科分类号
071007 ; 090102 ;
摘要
We present methods for imputing data for ungenotyped markers and for inferring haplotype phase in large data sets of unrelated individuals and parent-offspring trios. Our methods make use of known haplotype phase when it is available, and our methods are computationally efficient so that the full information in large reference panels with thousands of individuals is utilized. We demonstrate that substantial gains in imputation accuracy accrue with increasingly large reference panel sizes, particularly when imputing low-frequency A variants, and that unphased reference panels can provide highly accurate genotype imputation. We place our methodology in a unified framework that enables the simultaneous use of unphased and phased data from trios and unrelated individuals in a single analysis. For unrelated individuals, our imputation methods produce well-calibrated posterior genotype probabilities and highly accurate allele frequency estimates. For trios, our haplotype-inference method is four orders of magnitude faster than the gold-standard PHASE program 14 and has excellent accuracy. Our methods enable genotype imputation to be performed with unphased trio or unrelated reference panels, thus accounting for haplotype-phase uncertainty in the reference panel. We present a useful measure of imputation accuracy, allelic R-2, and show that this measure can be estimated accurately from posterior genotype probabilities. Our methods are implemented in version 3.0 of the BEAGLE software package.
引用
收藏
页码:210 / 223
页数:14
相关论文
共 29 条
[1]   Evaluating the effects of imputation on the power, coverage, and cost efficiency of genome-wide SNP platforms [J].
Anderson, Carl A. ;
Pettersson, Fredrik H. ;
Barrett, Jeffrey C. ;
Zhuang, Joanna J. ;
Ragoussis, Jiannis ;
Cardon, Lon R. ;
Morris, Andrew P. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2008, 83 (01) :112-119
[2]   Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease [J].
Barrett, Jeffrey C. ;
Hansoul, Sarah ;
Nicolae, Dan L. ;
Cho, Judy H. ;
Duerr, Richard H. ;
Rioux, John D. ;
Brant, Steven R. ;
Silverberg, Mark S. ;
Taylor, Kent D. ;
Barmada, M. Michael ;
Bitton, Alain ;
Dassopoulos, Themistocles ;
Datta, Lisa Wu ;
Green, Todd ;
Griffiths, Anne M. ;
Kistner, Emily O. ;
Murtha, Michael T. ;
Regueiro, Miguel D. ;
Rotter, Jerome I. ;
Schumm, L. Philip ;
Steinhart, A. Hillary ;
Targan, Stephan R. ;
Xavier, Ramnik J. ;
Libioulle, Cecile ;
Sandor, Cynthia ;
Lathrop, Mark ;
Belaiche, Jacques ;
Dewit, Olivier ;
Gut, Ivo ;
Heath, Simon ;
Laukens, Debby ;
Mni, Myriam ;
Rutgeerts, Paul ;
Van Gossum, Andre ;
Zelenika, Diana ;
Franchimont, Denis ;
Hugot, Jean-Pierre ;
de Vos, Martine ;
Vermeire, Severine ;
Louis, Edouard ;
Cardon, Lon R. ;
Anderson, Carl A. ;
Drummond, Hazel ;
Nimmo, Elaine ;
Ahmad, Tariq ;
Prescott, Natalie J. ;
Onnie, Clive M. ;
Fisher, Sheila A. ;
Marchini, Jonathan ;
Ghori, Jilur .
NATURE GENETICS, 2008, 40 (08) :955-962
[3]   Haplotypic analysis of wellcome trust case control consortium data [J].
Browning, Brian L. ;
Browning, Sharon R. .
HUMAN GENETICS, 2008, 123 (03) :273-280
[4]   Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering [J].
Browning, Sharon R. ;
Browning, Brian L. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2007, 81 (05) :1084-1097
[5]   Multilocus association mapping using variable-length Markov chains [J].
Browning, Sharon R. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2006, 78 (06) :903-913
[6]   Missing data imputation and haplotype phase inference for genome-wide association studies [J].
Browning, Sharon R. .
HUMAN GENETICS, 2008, 124 (05) :439-450
[7]   Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls [J].
Burton, Paul R. ;
Clayton, David G. ;
Cardon, Lon R. ;
Craddock, Nick ;
Deloukas, Panos ;
Duncanson, Audrey ;
Kwiatkowski, Dominic P. ;
McCarthy, Mark I. ;
Ouwehand, Willem H. ;
Samani, Nilesh J. ;
Todd, John A. ;
Donnelly, Peter ;
Barrett, Jeffrey C. ;
Davison, Dan ;
Easton, Doug ;
Evans, David ;
Leung, Hin-Tak ;
Marchini, Jonathan L. ;
Morris, Andrew P. ;
Spencer, Chris C. A. ;
Tobin, Martin D. ;
Attwood, Antony P. ;
Boorman, James P. ;
Cant, Barbara ;
Everson, Ursula ;
Hussey, Judith M. ;
Jolley, Jennifer D. ;
Knight, Alexandra S. ;
Koch, Kerstin ;
Meech, Elizabeth ;
Nutland, Sarah ;
Prowse, Christopher V. ;
Stevens, Helen E. ;
Taylor, Niall C. ;
Walters, Graham R. ;
Walker, Neil M. ;
Watkins, Nicholas A. ;
Winzer, Thilo ;
Jones, Richard W. ;
McArdle, Wendy L. ;
Ring, Susan M. ;
Strachan, David P. ;
Pembrey, Marcus ;
Breen, Gerome ;
St Clair, David ;
Caesar, Sian ;
Gordon-Smith, Katherine ;
Jones, Lisa ;
Fraser, Christine ;
Green, Elain K. .
NATURE, 2007, 447 (7145) :661-678
[8]   Population structure, differential bias and genomic control in a large-scale, case-control association study [J].
Clayton, DG ;
Walker, NM ;
Smyth, DJ ;
Pask, R ;
Cooper, JD ;
Maier, LM ;
Smink, LJ ;
Lam, AC ;
Ovington, NR ;
Stevens, HE ;
Nutland, S ;
Howson, JMM ;
Faham, M ;
Moorhead, M ;
Jones, HB ;
Falkowski, M ;
Hardenbol, P ;
Willis, TD ;
Todd, JA .
NATURE GENETICS, 2005, 37 (11) :1243-1246
[9]   Practical aspects of imputation-driven meta-analysis of genome-wide association studies [J].
de Bakker, Paul I. W. ;
Ferreira, Manuel A. R. ;
Jia, Xiaoming ;
Neale, Benjamin M. ;
Raychaudhuri, Soumya ;
Voight, Benjamin F. .
HUMAN MOLECULAR GENETICS, 2008, 17 :R122-R128
[10]   A second generation human haplotype map of over 3.1 million SNPs [J].
Frazer, Kelly A. ;
Ballinger, Dennis G. ;
Cox, David R. ;
Hinds, David A. ;
Stuve, Laura L. ;
Gibbs, Richard A. ;
Belmont, John W. ;
Boudreau, Andrew ;
Hardenbol, Paul ;
Leal, Suzanne M. ;
Pasternak, Shiran ;
Wheeler, David A. ;
Willis, Thomas D. ;
Yu, Fuli ;
Yang, Huanming ;
Zeng, Changqing ;
Gao, Yang ;
Hu, Haoran ;
Hu, Weitao ;
Li, Chaohua ;
Lin, Wei ;
Liu, Siqi ;
Pan, Hao ;
Tang, Xiaoli ;
Wang, Jian ;
Wang, Wei ;
Yu, Jun ;
Zhang, Bo ;
Zhang, Qingrun ;
Zhao, Hongbin ;
Zhao, Hui ;
Zhou, Jun ;
Gabriel, Stacey B. ;
Barry, Rachel ;
Blumenstiel, Brendan ;
Camargo, Amy ;
Defelice, Matthew ;
Faggart, Maura ;
Goyette, Mary ;
Gupta, Supriya ;
Moore, Jamie ;
Nguyen, Huy ;
Onofrio, Robert C. ;
Parkin, Melissa ;
Roy, Jessica ;
Stahl, Erich ;
Winchester, Ellen ;
Ziaugra, Liuda ;
Altshuler, David ;
Shen, Yan .
NATURE, 2007, 449 (7164) :851-U3