Quality Control Procedures for Genome-Wide Association Studies

Cited by: 11
Authors
Truong, Van Q. [1, 2]
Woerner, Jakob A. [1, 2]
Cherlin, Tess A. [3]
Bradford, Yuki [2]
Lucas, Anastasia M. [1, 2]
Okeh, Chelsea C. [3]
Shivakumar, Manu K. [1, 2]
Hui, Daniel H. [1, 2]
Kumar, Rachit [1, 2]
Pividori, Milton [2]
Jones, S. Chris [1, 2]
Bossa, Abigail C. [2]
Turner, Stephen D. [4]
Ritchie, Marylyn D. [2]
Verma, Shefali S. [3]
Affiliations
[1] Univ Penn, Perelman Sch Med, Genom & Computat Biol Grad Grp, Philadelphia, PA USA
[2] Univ Penn, Perelman Sch Med, Dept Genet, Philadelphia, PA USA
[3] Univ Penn, Perelman Sch Med, Dept Pathol & Lab Med, Philadelphia, PA 19104 USA
[4] Signature Sci LLC, Charlottesville, VA USA
Source
CURRENT PROTOCOLS | 2022 / Vol. 2 / Issue 11
Keywords
1000 Genomes Project; biobanks; electronic health records (EHR); genome-wide association studies; genomics; genotype imputation; GWAS; quality control (QC); MULTILOCUS GENOTYPE DATA; POPULATION-STRUCTURE; IMPUTATION; INFERENCE; STRATIFICATION; EFFICIENT; DATABASE; MODEL; LOCI;
DOI
10.1002/cpz1.603
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of many complex diseases. Regardless of the context, the practical utility of this information ultimately depends upon the quality of the data used for statistical analyses. Quality control (QC) procedures for GWAS are constantly evolving. Here, we enumerate some of the challenges in QC of genotyped GWAS data and describe the approaches involving genotype imputation of a sample dataset along with post-imputation quality assurance, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of the GWAS data (genotyped and imputed), including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We provide detailed guidelines along with a sample dataset to suggest current best practices and discuss areas of ongoing and future research. © 2022 Wiley Periodicals LLC.
Pages: 21