Effects of input data quantity on genome-wide association studies (GWAS)

Cited by: 0
Authors
Yan, Yan [1]
Burbridge, Connor [2]
Shi, Jinhong [1]
Liu, Juxin [3]
Kusalik, Anthony [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, 110 Sci Pl, Saskatoon, SK S7N 5C9, Canada
[2] Univ Saskatchewan, Global Inst Food Secur, 110 Gymnasium Pl, Saskatoon, SK S7N 0W9, Canada
[3] Univ Saskatchewan, Dept Math & Stat, McLean Hall, Saskatoon, SK S7N 5E6, Canada
Keywords
GWAS; genome-wide association study; Arabidopsis thaliana; plant phenomics; plant genomics; PLINK; TASSEL; GAPIT; FaST-LMM; statistical power; input data quantity; epistasis; POWER; TOOL;
DOI
Not available
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Many software packages have been developed for Genome-Wide Association Studies (GWAS) based on various statistical models. One key factor influencing the statistical reliability of GWAS is the amount of input data used. In this paper, we investigate how input data quantity influences the output of four widely used GWAS programs, PLINK, TASSEL, GAPIT, and FaST-LMM, in the context of plant genomes and phenotypes. Both synthetic and real data are used. Evaluation is based on the p- and q-values of output SNPs, and on the Kendall rank correlation between output SNP lists. Results show that, for the same GWAS program, different Arabidopsis thaliana datasets demonstrate similar trends in rank correlation as input quantity varies, but differ in the numbers of SNPs passing a given p- or q-value threshold. We also show that variations in the number of replicates influence the p-values of SNPs, but do not strongly affect the rank correlation.
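The two evaluation measures named in the abstract, rank correlation between output SNP lists and the number of SNPs passing a p-value threshold, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the SNP identifiers, and all p-values below are hypothetical, and the tau-a variant of the Kendall coefficient (no tied ranks) is an assumption, since the record does not say which variant the paper uses.

```python
from itertools import combinations


def rank_by_p(pvalues):
    """Map each SNP id to its rank, where rank 1 is the smallest p-value."""
    ordered = sorted(pvalues, key=pvalues.get)
    return {snp: i + 1 for i, snp in enumerate(ordered)}


def kendall_tau_a(rank_x, rank_y):
    """Kendall tau-a over the SNPs present in both rankings (assumes no ties)."""
    snps = sorted(set(rank_x) & set(rank_y))
    if len(snps) < 2:
        raise ValueError("need at least two shared SNPs")
    concordant = discordant = 0
    for a, b in combinations(snps, 2):
        sign = (rank_x[a] - rank_x[b]) * (rank_y[a] - rank_y[b])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(snps) * (len(snps) - 1) // 2
    return (concordant - discordant) / n_pairs


def count_passing(pvalues, threshold):
    """Number of SNPs whose p-value is at or below the threshold."""
    return sum(p <= threshold for p in pvalues.values())


# Hypothetical p-values from one GWAS program run on a full dataset
# and on a reduced subset of the same dataset (illustrative numbers only).
full_run = {"snp1": 1e-8, "snp2": 3e-6, "snp3": 2e-4, "snp4": 0.03, "snp5": 0.4}
reduced_run = {"snp1": 5e-5, "snp2": 1e-3, "snp3": 8e-4, "snp4": 0.2, "snp5": 0.1}

tau = kendall_tau_a(rank_by_p(full_run), rank_by_p(reduced_run))
print("Kendall tau-a:", tau)
print("SNPs passing 1e-5 (full):", count_passing(full_run, 1e-5))
print("SNPs passing 1e-5 (reduced):", count_passing(reduced_run, 1e-5))
```

In this toy input the ordering of SNPs is largely preserved between the two runs while the count of SNPs under the threshold drops, which is the kind of contrast between the two measures that the abstract describes. For real SNP lists with tied p-values, a tie-aware variant such as tau-b would be the safer choice.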
Pages: 19-43
Number of pages: 25