BGWAS: Bayesian variable selection in linear mixed models with nonlocal priors for genome-wide association studies

Cited by: 4
Authors
Williams, Jacob [1]
Xu, Shuangshuang [1]
Ferreira, Marco A. R. [1]
Affiliations
[1] Virginia Tech, Dept Stat, Blacksburg, VA 24061 USA
Funding
National Science Foundation (USA);
Keywords
GWAS; Bayesian; Model selection;
DOI
10.1186/s12859-023-05316-x
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Background: Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) that cause observed phenotypes. However, with highly correlated SNPs, correlated observations, and the number of SNPs being two orders of magnitude larger than the number of observations, GWAS procedures often suffer from high false positive rates.

Results: We propose BGWAS, a novel Bayesian variable selection method based on nonlocal priors for linear mixed models specifically tailored for genome-wide association studies. Our proposed method BGWAS uses a novel nonlocal prior for linear mixed models (LMMs). BGWAS has two steps: screening and model selection. The screening step scans through all the SNPs fitting one LMM for each SNP and then uses Bayesian false discovery control to select a set of candidate SNPs. After that, a model selection step searches through the space of LMMs that may have any number of SNPs from the candidate set. A simulation study shows that, when compared to popular GWAS procedures, BGWAS greatly reduces false positives while maintaining the same ability to detect true positive SNPs. We show the utility and flexibility of BGWAS with two case studies: a case study on salt stress in plants, and a case study on alcohol use disorder.

Conclusions: BGWAS maintains and in some cases increases the recall of true SNPs while drastically lowering the number of false positives compared to popular SMA procedures.
Pages: 20