Statistical Selection of Biological Models for Genome-Wide Association Analyses

Cited by: 0
Authors
Bi, Wenjian [1]
Kang, Guolian [1]
Pounds, Stanley B. [1]
Affiliations
[1] St Jude Childrens Res Hosp, Dept Biostat, 332 N Lauderdale St, Memphis, TN 38105 USA
Funding
National Institutes of Health (USA);
Keywords
biological models; genome-wide association study; multiple adjusted evidence weights; two-stage discovery validation study; FALSE DISCOVERY RATES; FETAL-HEMOGLOBIN; P-VALUES; IDENTIFICATION; MICROARRAY; PHENOTYPE;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc.) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/software.
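The five biological models named in the abstract can be illustrated by how a SNP genotype is turned into regression predictors. The sketch below uses one common textbook convention and is not taken from the paper or its R package: it assumes genotypes are coded as minor-allele counts (0, 1, 2), and the function name `encode_genotype` and the model labels are hypothetical.

```python
from typing import Tuple

MODELS = ("null", "additive", "dominant", "recessive", "co-dominant")

def encode_genotype(g: int, model: str) -> Tuple[float, ...]:
    """Return the predictor(s) for minor-allele count g under a given model.

    Assumed convention (not from the source): g is 0, 1, or 2 copies
    of the minor allele.
    """
    if g not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2")
    if model == "null":
        # No genotype effect: no predictor at all.
        return ()
    if model == "additive":
        # Effect grows linearly with the number of minor alleles.
        return (float(g),)
    if model == "dominant":
        # One copy of the minor allele is enough for the full effect.
        return (1.0 if g >= 1 else 0.0,)
    if model == "recessive":
        # Two copies are required for any effect.
        return (1.0 if g == 2 else 0.0,)
    if model == "co-dominant":
        # Each genotype has its own effect: two indicator variables.
        return (1.0 if g == 1 else 0.0, 1.0 if g == 2 else 0.0)
    raise ValueError(f"unknown model: {model}")

if __name__ == "__main__":
    for m in MODELS:
        print(m, [encode_genotype(g, m) for g in (0, 1, 2)])
```

Under this coding the additive, dominant, and recessive models each use one predictor, while the co-dominant model uses two, which is why it can fit any pattern of three genotype means at the cost of an extra degree of freedom.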
Pages: 150 - 157
Page count: 8
Related papers
50 in total
  • [1] Statistical selection of biological models for genome-wide association analyses
    Bi, Wenjian
    Kang, Guolian
    Pounds, Stanley B.
    METHODS, 2018, 145 : 67 - 75
  • [2] Selection of important variables by statistical learning in genome-wide association analysis
    Wei (Will) Yang
    C Charles Gu
    BMC Proceedings, 3 (Suppl 7)
  • [3] Statistical Power of Model Selection Strategies for Genome-Wide Association Studies
    Wu, Zheyang
    Zhao, Hongyu
    PLOS GENETICS, 2009, 5 (07):
  • [4] Statistical methods adopted in genome-wide association study and genomic selection
    Hayashi, Takeshi
    GENES & GENETIC SYSTEMS, 2011, 86 (06) : 393 - 393
  • [5] Stability Selection for Genome-Wide Association
    Alexander, David H.
    Lange, Kenneth
    GENETIC EPIDEMIOLOGY, 2011, 35 (07) : 722 - 728
  • [6] Functional models in genome-wide selection
    Moura, Ernandes Guedes
    Pamplona, Andrezza Kellen Alves
    Balestre, Marcio
    PLOS ONE, 2019, 14 (10):
  • [7] Statistical Approaches to Genome-wide Biological Networks
    Do, Jin Hwan
    Miyano, Satoru
    Choi, Dong-Kug
    BIOCHIP JOURNAL, 2009, 3 (03) : 190 - 202
  • [8] Genome-wide association analyses of expression phenotypes
    Chen, Gary K.
    Zheng, Tian
    Witte, John S.
    Goode, Ellen L.
    GENETIC EPIDEMIOLOGY, 2007, 31 : S7 - S11
  • [9] Statistical methods for genome-wide association studies
    Wang, Maggie Haitian
    Cordell, Heather J.
    Van Steen, Kristel
    SEMINARS IN CANCER BIOLOGY, 2019, 55 : 53 - 60
  • [10] Statistical Methods in Genome-Wide Association Studies
    Sun, Ning
    Zhao, Hongyu
    ANNUAL REVIEW OF BIOMEDICAL DATA SCIENCE, VOL 3, 2020, 2020, 3 : 265 - 288