Genome-Wide Regression and Prediction with the BGLR Statistical Package

Times cited: 1029
Authors
Perez, Paulino [1]
de los Campos, Gustavo [2]
Affiliations
[1] Colegio Postgrad, Mexico City 56230, DF, Mexico
[2] Univ Alabama Birmingham, Dept Biostat, Sect Stat Genet, Birmingham, AL 35294 USA
Funding
U.S. National Institutes of Health;
Keywords
Bayesian methods; regression; whole-genome regression; whole-genome prediction; genome-wide regression; variable selection; shrinkage; semiparametric regression; reproducing kernel Hilbert spaces regressions; RKHS; R; GenPred; shared data resource; DENSE MOLECULAR MARKERS; QUANTITATIVE TRAITS; GENETIC VALUES; ENABLED PREDICTION; COMPLEX TRAITS; R PACKAGE; SELECTION; MODELS; DISTRIBUTIONS; ASSOCIATION;
DOI
10.1534/genetics.114.164442
Chinese Library Classification number
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis.
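The abstract mentions a Gibbs sampler with scalar updates. The following is a minimal sketch of that idea for a linear model with a Gaussian prior on the effects (a Bayesian ridge regression), written in pure Python for illustration. It is not the package's compiled C/Fortran code and does not reproduce its interface; the function names `simulate` and `gibbs_ridge`, the prior hyperparameters `df0` and `s0`, and the simulated data are all assumptions made for this example.

```python
import math
import random


def simulate(n=20, p=40, seed=7):
    """Toy data with more predictors than records (p > n); purely illustrative."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    true_b = [1.0 if j < 3 else 0.0 for j in range(p)]
    y = [sum(x * bj for x, bj in zip(row, true_b)) + rng.gauss(0, 0.5) for row in X]
    return X, y


def gibbs_ridge(X, y, n_iter=300, burn_in=100, seed=1):
    """Gibbs sampler with scalar updates for y = mu + X b + e, b_j ~ N(0, var_b)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]  # column-wise access
    xx = [sum(v * v for v in col) for col in cols]           # x_j' x_j, precomputed

    mu = sum(y) / n
    b = [0.0] * p
    e = [yi - mu for yi in y]      # residuals, kept in sync with mu and b
    var_e, var_b = 1.0, 1.0
    df0, s0 = 5.0, 1.0             # hypothetical prior hyperparameters
    post_b = [0.0] * p             # running posterior mean of b
    kept = 0

    for it in range(n_iter):
        # Intercept: flat prior, normal full conditional.
        e = [ei + mu for ei in e]
        mu = rng.gauss(sum(e) / n, math.sqrt(var_e / n))
        e = [ei - mu for ei in e]

        # Effects: draw one scalar at a time, then update the residuals.
        for j in range(p):
            col = cols[j]
            rhs = sum(c * ei for c, ei in zip(col, e)) + xx[j] * b[j]
            c_jj = xx[j] + var_e / var_b
            new_b = rng.gauss(rhs / c_jj, math.sqrt(var_e / c_jj))
            delta = b[j] - new_b
            e = [ei + c * delta for c, ei in zip(col, e)]
            b[j] = new_b

        # Variances: scaled inverse chi-square full conditionals.
        # A chi-square draw with k df is Gamma(shape=k/2, scale=2).
        var_b = (df0 * s0 + sum(bj * bj for bj in b)) / rng.gammavariate((df0 + p) / 2, 2.0)
        var_e = (df0 * s0 + sum(ei * ei for ei in e)) / rng.gammavariate((df0 + n) / 2, 2.0)

        if it >= burn_in:
            kept += 1
            post_b = [m + (bj - m) / kept for m, bj in zip(post_b, b)]

    return {"mu": mu, "b": b, "e": e, "post_b": post_b, "var_e": var_e, "var_b": var_b}


if __name__ == "__main__":
    X, y = simulate()
    fit = gibbs_ridge(X, y)
    print([round(v, 2) for v in fit["post_b"][:5]])
```

Each effect is drawn from its full conditional distribution given the current residuals, so no p-by-p system has to be solved. The cost per sweep grows with n times p, which is what makes this kind of update workable when p exceeds n.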
Pages: 483 / U63
Page count: 28