Comparing Four Genome-Wide Association Study (GWAS) Programs with Varied Input Data Quantity

Authors
Yan, Yan [1]
Burbridge, Connor [1]
Shi, Jinhong [1]
Liu, Juxin [2]
Kusalik, Anthony [1]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
[2] Univ Saskatchewan, Dept Math & Stat, Saskatoon, SK, Canada
Source
PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) | 2018
Keywords
Genome-Wide Association Study (GWAS); Arabidopsis thaliana; plant phenomics; plant genomics; PLINK; TASSEL; GAPIT; FaST-LMM; POWER
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Genome-wide association studies (GWAS) have served for the past decade as primary methods for identifying associations between genetic variants and traits or diseases. Many software packages have been developed for GWAS analysis based on different statistical models. One key factor influencing the statistical reliability of GWAS is the amount of input data used. Few studies have investigated this effect by comparing the performance of GWAS programs using varied amounts of experimental data, especially in the context of plants and plant genomes. In this paper, we investigate how input data quantity influences the output of four widely used GWAS programs: PLINK, TASSEL, GAPIT, and FaST-LMM. Both synthetic and real data are used. Standard GWAS output includes single nucleotide polymorphisms (SNPs) and their p-values. To evaluate the programs, p-values and q-values of SNPs, and the Kendall rank correlation between output SNP lists, are used. Results show that with the same GWAS program, different Arabidopsis thaliana datasets demonstrate similar trends of rank correlation with varied input quantity, but differ in the numbers of SNPs passing a given p- or q-value threshold. In practice, experimental datasets may have samples containing varied numbers of biological replicates. We show that this variation in replicates influences the p-values of SNPs, but does not strongly affect the rank correlation. When comparing synthetic and real data, the output SNPs from synthetic data have similar rank correlation trends across all four GWAS programs, but the same measure from real data varies across the programs. In addition, the real data results in a linear-like increase in the numbers of significant SNPs with more input data, but the synthetic data does not follow this trend. This study provides guidance on selecting GWAS programs when varied experimental data is present and on selecting significant SNPs for subsequent study. It contributes to understanding how much input data is necessary to yield satisfactory GWAS results.
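The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the tau-a variant of the Kendall rank correlation (no tied p-values) and uses Benjamini-Hochberg adjusted p-values as a stand-in for q-values; the abstract does not state which variant or which q-value procedure was used. The SNP names, the p-values, the 0.1 threshold, and the helper names `kendall_tau_a` and `bh_adjust` are all hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a between two equal-length score lists (assumes no ties)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for the same five SNPs from two runs of one program:
# one run on the full dataset, one on a reduced subset.
full_run = {"snp1": 0.001, "snp2": 0.02, "snp3": 0.03, "snp4": 0.2, "snp5": 0.5}
half_run = {"snp1": 0.004, "snp2": 0.05, "snp3": 0.01, "snp4": 0.3, "snp5": 0.6}

snps = sorted(full_run)
tau = kendall_tau_a([full_run[s] for s in snps], [half_run[s] for s in snps])
adjusted = bh_adjust([full_run[s] for s in snps])
n_significant = sum(q <= 0.1 for q in adjusted)  # illustrative threshold

print(f"Kendall tau-a between runs: {tau:.2f}")
print(f"SNPs with adjusted p <= 0.1 in the full run: {n_significant}")
```

Repeating this comparison for successively smaller subsets of the input would give one rank correlation value and one significant-SNP count per data quantity, which is the kind of trend the abstract describes.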
Pages: 1802-1809
Page count: 8