Quality Control Procedures for Genome-Wide Association Studies

Cited by: 11
Authors
Truong, Van Q. [1 ,2 ]
Woerner, Jakob A. [1 ,2 ]
Cherlin, Tess A. [3 ]
Bradford, Yuki [2 ]
Lucas, Anastasia M. [1 ,2 ]
Okeh, Chelsea C. [3 ]
Shivakumar, Manu K. [1 ,2 ]
Hui, Daniel H. [1 ,2 ]
Kumar, Rachit [1 ,2 ]
Pividori, Milton [2 ]
Jones, S. Chris [1 ,2 ]
Bossa, Abigail C. [2 ]
Turner, Stephen D. [4 ]
Ritchie, Marylyn D. [2 ]
Verma, Shefali S. [3 ]
Affiliations
[1] Univ Penn, Perelman Sch Med, Genom & Computat Biol Grad Grp, Philadelphia, PA USA
[2] Univ Penn, Perelman Sch Med, Dept Genet, Philadelphia, PA USA
[3] Univ Penn, Perelman Sch Med, Dept Pathol & Lab Med, Philadelphia, PA 19104 USA
[4] Signature Sci LLC, Charlottesville, VA USA
Source
CURRENT PROTOCOLS | 2022 / Vol. 2 / Issue 11
Keywords
1000 Genomes Project; biobanks; electronic health records (EHR); genome-wide association studies; genomics; genotype imputation; GWAS; quality control (QC); MULTILOCUS GENOTYPE DATA; POPULATION-STRUCTURE; IMPUTATION; INFERENCE; STRATIFICATION; EFFICIENT; DATABASE; MODEL; LOCI;
DOI
10.1002/cpz1.603
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of many complex diseases. Regardless of the context, the practical utility of this information ultimately depends upon the quality of the data used for statistical analyses. Quality control (QC) procedures for GWAS are constantly evolving. Here, we enumerate some of the challenges in QC of genotyped GWAS data and describe the approaches involving genotype imputation of a sample dataset along with post-imputation quality assurance, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of the GWAS data (genotyped and imputed), including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We provide detailed guidelines along with a sample dataset to suggest current best practices and discuss areas of ongoing and future research. (c) 2022 Wiley Periodicals LLC.
Pages: 21
Related papers
50 in total
  • [1] Quality Control and Quality Assurance in Genotypic Data for Genome-Wide Association Studies
    Laurie, Cathy C.
    Doheny, Kimberly F.
    Mirel, Daniel B.
    Pugh, Elizabeth W.
    Bierut, Laura J.
    Bhangale, Tushar
    Boehm, Frederick
    Caporaso, Neil E.
    Cornelis, Marilyn C.
    Edenberg, Howard J.
    Gabriel, Stacy B.
    Harris, Emily L.
    Hu, Frank B.
    Jacobs, Kevin B.
    Kraft, Peter
    Landi, Maria Teresa
    Lumley, Thomas
    Manolio, Teri A.
    McHugh, Caitlin
    Painter, Ian
    Paschall, Justin
    Rice, John P.
    Rice, Kenneth M.
    Zheng, Xiuwen
    Weir, Bruce S.
    GENETIC EPIDEMIOLOGY, 2010, 34 (06) : 591 - 602
  • [2] A quality control algorithm for filtering SNPs in genome-wide association studies
    Pongpanich, Monnat
    Sullivan, Patrick F.
    Tzeng, Jung-Ying
    BIOINFORMATICS, 2010, 26 (14) : 1731 - 1737
  • [3] Weighted multiple testing procedures in genome-wide association studies
    Obry, Ludivine
    Dalmasso, Cyril
    PEERJ, 2023, 11
  • [4] Genome-Wide Association Studies: Quality Control and Population-Based Measures
    Ziegler, Andreas
    GENETIC EPIDEMIOLOGY, 2009, 33 : S45 - S50
  • [5] A tutorial on conducting genome-wide association studies: Quality control and statistical analysis
    Marees, Andries T.
    de Kluiver, Hilde
    Stringer, Sven
    Vorspan, Florence
    Curis, Emmanuel
    Marie-Claire, Cynthia
    Derks, Eske M.
    INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, 2018, 27 (02)
  • [6] Genome-wide association studies
    Nature Reviews Methods Primers, 1
  • [7] Genome-wide association studies
    Willson, Joseph
    NATURE REVIEWS METHODS PRIMERS, 2021, 1 (01)
  • [8] Genome-Wide Association Studies
    Guo, Xiuqing
    Rotter, Jerome I.
    JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2019, 322 (17) : 1705 - 1706
  • [9] SQC: secure quality control for meta-analysis of genome-wide association studies
    Huang, Zhicong
    Lin, Huang
    Fellay, Jacques
    Kutalik, Zoltan
    Hubaux, Jean-Pierre
    BIOINFORMATICS, 2017, 33 (15) : 2273 - 2280
  • [10] A Novel Single Nucleotide Polymorphisms Quality Control Method in Genome-Wide Association Studies
    Sun, Yuliang
    Li, Renfa
    Liao, Bo
    Li, Xiong
    Cao, Zhi
    JOURNAL OF COMPUTATIONAL AND THEORETICAL NANOSCIENCE, 2014, 11 (07) : 1649 - 1652