Effects of input data quantity on genome-wide association studies (GWAS)

Authors
Yan, Yan [1]
Burbridge, Connor [2]
Shi, Jinhong [1]
Liu, Juxin [3]
Kusalik, Anthony [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, 110 Sci Pl, Saskatoon, SK S7N 5C9, Canada
[2] Univ Saskatchewan, Global Inst Food Secur, 110 Gymnasium Pl, Saskatoon, SK S7N 0W9, Canada
[3] Univ Saskatchewan, Dept Math & Stat, McLean Hall, Saskatoon, SK S7N 5E6, Canada
Keywords
GWAS; genome-wide association study; Arabidopsis thaliana; plant phenomics; plant genomics; PLINK; TASSEL; GAPIT; FaST-LMM; statistical power; input data quantity; epistasis; POWER; TOOL
DOI
Not available
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Many software packages, based on various statistical models, have been developed for genome-wide association studies (GWAS). One key factor influencing the statistical reliability of GWAS is the amount of input data. In this paper, we investigate how input data quantity influences the output of four widely used GWAS programs (PLINK, TASSEL, GAPIT, and FaST-LMM) in the context of plant genomes and phenotypes. Both synthetic and real data are used. Evaluation is based on the p- and q-values of output SNPs and on the Kendall rank correlation between output SNP lists. Results show that, for the same GWAS program, different Arabidopsis thaliana datasets show similar trends in rank correlation as input quantity varies, but differ in the number of SNPs passing a given p- or q-value threshold. We also show that varying the number of replicates influences the p-values of SNPs but does not strongly affect the rank correlation.
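The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the SNP names and p-values are invented, the helper names are hypothetical, and the tie-free tau-a form of Kendall's coefficient is assumed (the abstract does not say which variant was used).

```python
from itertools import combinations


def ranks_from_pvalues(pvalues):
    """Rank SNPs by ascending p-value (rank 1 = most significant)."""
    ordered = sorted(pvalues, key=pvalues.get)
    return {snp: i + 1 for i, snp in enumerate(ordered)}


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a between two rankings of the same SNPs (assumes no ties)."""
    snps = sorted(set(rank_a) & set(rank_b))
    if len(snps) < 2:
        raise ValueError("need at least two shared SNPs")
    concordant = discordant = 0
    for s, t in combinations(snps, 2):
        sign = (rank_a[s] - rank_a[t]) * (rank_b[s] - rank_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(snps) * (len(snps) - 1) // 2
    return (concordant - discordant) / n_pairs


def count_passing(pvalues, threshold):
    """Number of SNPs whose p-value is at or below the threshold."""
    return sum(p <= threshold for p in pvalues.values())


# Hypothetical p-values from one GWAS program run on a full dataset
# and on a reduced subset of the same data (illustrative numbers only).
full = {"snp1": 1e-8, "snp2": 1e-5, "snp3": 0.01, "snp4": 0.2}
subset = {"snp1": 1e-4, "snp2": 1e-3, "snp3": 0.3, "snp4": 0.05}

tau = kendall_tau(ranks_from_pvalues(full), ranks_from_pvalues(subset))
n_full = count_passing(full, 1e-4)
n_subset = count_passing(subset, 1e-4)
print(f"tau = {tau:.3f}, passing: full = {n_full}, subset = {n_subset}")
```

In this toy case the SNP ordering is largely preserved when the input shrinks, while fewer SNPs pass the threshold. That is the kind of divergence between the two measures that the abstract describes.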
Pages: 19-43
Page count: 25
Related papers
50 in total
  • [1] Comparing Four Genome-Wide Association Study (GWAS) Programs with Varied Input Data Quantity
    Yan, Yan
    Burbridge, Connor
    Shi, Jinhong
    Liu, Juxin
    Kusalik, Anthony
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018 : 1802 - 1809
  • [2] A BAYESIAN GRAPHICAL MODEL FOR GENOME-WIDE ASSOCIATION STUDIES (GWAS)
    Briollais, Laurent
    Dobra, Adrian
    Liu, Jinnan
    Friedlander, Matt
    Ozcelik, Hilmi
    Massam, Helene
    ANNALS OF APPLIED STATISTICS, 2016, 10 (02) : 786 - 811
  • [3] An Introduction to Genome-Wide Association Studies: GWAS for Dummies
    Uitterlinden, A. G.
    SEMINARS IN REPRODUCTIVE MEDICINE, 2016, 34 (04) : 196 - 204
  • [4] Advancements and Prospects of Genome-Wide Association Studies (GWAS) in Maize
    Sahito, Javed Hussain
    Zhang, Hao
    Gishkori, Zeeshan Ghulam Nabi
    Ma, Chenhui
    Wang, Zhihao
    Ding, Dong
    Zhang, Xuehai
    Tang, Jihua
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (03)
  • [5] Data validation and statistical issues such as power and other considerations in genome-wide association study (GWAS)
    Tomita, Makoto
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2023, 15 (03)
  • [6] Data Quality Assessment in Genome Wide Association Studies (GWAS)
    Etcheverry, Lorena
    Marotta, Adriana
    Ruggia, Raul
    SISTEMAS Y TECNOLOGIAS DE INFORMACION, 2010 : 559 - 563
  • [7] Testing for Polygenic Effects in Genome-Wide Association Studies
    Pan, Wei
    Chen, Yue-Ming
    Wei, Peng
    GENETIC EPIDEMIOLOGY, 2015, 39 (04) : 306 - 316
  • [8] GWAS Central: a comprehensive resource for the comparison and interrogation of genome-wide association studies
    Beck, Tim
    Hastings, Robert K.
    Gollapudi, Sirisha
    Free, Robert C.
    Brookes, Anthony J.
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2014, 22 (07) : 949 - 952
  • [9] Sleep duration: A review of genome-wide association studies (GWAS) in adults from 2007 to 2020
    Garfield, Victoria
    SLEEP MEDICINE REVIEWS, 2021, 56