Quality control, imputation and analysis of genome-wide genotyping data from the Illumina HumanCoreExome microarray

Cited by: 55
Authors
Coleman, Jonathan R. I. [1]
Euesden, Jack [2]
Patel, Hamel [2, 3]
Folarin, Amos A. [4]
Newhouse, Stephen [4]
Breen, Gerome [2, 5]
Affiliations
[1] MRC Social Genet & Dev Psychiat Ctr SGDP, London, England
[2] SGDP, London, England
[3] South London & Maudsley NHS Trust, Natl Inst Hlth Res, Biomed Res Ctr Mental Hlth, Bioinformat Core, London, England
[4] NIHR, BRC MH, Bioinformat Core, London, England
[5] NIHR, BRC MH, Genom & Biomarkers & BioResource Mental & Neurol, London, England
Keywords
GWAS; methods; low-coverage microarray; imputation; analysis; ASSOCIATION; MODEL; PLINK
DOI
10.1093/bfgp/elv037
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification code
071005; 0836; 090102; 100705
Abstract
The decreasing cost of performing genome-wide association studies has made genomics widely accessible. However, there is a paucity of guidance for best practice in conducting such analyses. For the results of a study to be valid and replicable, multiple biases must be addressed in the course of data preparation and analysis. In addition, standardizing methods across small, independent studies would increase comparability and the potential for effective meta-analysis. This article provides a discussion of important aspects of quality control, imputation and analysis of genome-wide data from a low-coverage microarray, as well as a straightforward guide to performing a genome-wide association study. A detailed protocol is provided online, with example scripts available at https://github.com/JoniColeman/gwas_scripts.
Pages: 298-304
Page count: 7