DAIRRy-BLUP: A High-Performance Computing Approach to Genomic Prediction

Cited by: 6

Authors:

De Coninck, Arne [1]

Fostier, Jan [2,3]

Maenhout, Steven [4]

De Baets, Bernard [1]

Affiliations:

[1] Univ Ghent, Res Unit Knowledge Based Syst KERMIT, Dept Math Modelling Stat & Bioinformat, B-9000 Ghent, Belgium

[2] Ghent Univ IMinds, IBCN, B-9000 Ghent, Belgium

[3] Ghent Univ IMinds, Serv Res Unit, Dept Informat Technol, B-9000 Ghent, Belgium

[4] Progeno, B-9052 Zwijnaarde, Belgium

Source:

GENETICS | 2014 / Volume 197 / Issue 3

Keywords:

RIDGE-REGRESSION; SELECTION; INFORMATION; GENETICS; SIMULATION; ALGORITHM;

DOI:

10.1534/genetics.114.163683

Chinese Library Classification number:

Q3 [Genetics];

Discipline classification code:

071007 ; 090102 ;

Abstract:

In genomic prediction, common analysis methods rely on a linear mixed-model framework to estimate SNP marker effects and breeding values of animals or plants. Ridge regression-best linear unbiased prediction (RR-BLUP) is based on the assumptions that SNP marker effects are normally distributed, are uncorrelated, and have equal variances. We propose DAIRRy-BLUP, a parallel, Distributed-memory RR-BLUP implementation, based on single-trait observations (y), that uses the Average Information algorithm for restricted maximum-likelihood estimation of the variance components. The goal of DAIRRy-BLUP is to enable the analysis of large-scale data sets to provide more accurate estimates of marker effects and breeding values. A distributed-memory framework is required since the dimensionality of the problem, determined by the number of SNP markers, can become too large to be analyzed by a single computing node. Initial results show that DAIRRy-BLUP enables the analysis of very large-scale data sets (up to 1,000,000 individuals and 360,000 SNPs) and indicate that increasing the number of phenotypic and genotypic records has a more significant effect on the prediction accuracy than increasing the density of SNP arrays.
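The RR-BLUP assumptions stated in the abstract (normally distributed, uncorrelated SNP effects with equal variances) lead to a ridge-penalized linear system, which can be illustrated with a small sketch. The code below is not the DAIRRy-BLUP implementation. It is a single-node, pure-Python toy that assumes the ratio of residual to marker variance (`lam`) is already known, whereas DAIRRy-BLUP estimates the variance components with Average Information REML on distributed memory. The function names `solve` and `rr_blup`, the intercept-only fixed effect, and the toy genotypes and phenotypes are all hypothetical choices made for illustration.

```python
# Minimal RR-BLUP sketch (single node, pure Python). The variance ratio is
# assumed known here; DAIRRy-BLUP estimates variances with AI-REML instead.

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rr_blup(y, Z, lam):
    """Return (mean, marker_effects) for the model y = 1*mu + Z u + e.

    lam is the assumed ratio sigma_e^2 / sigma_u^2 (hypothetical input).
    """
    n, p = len(Z), len(Z[0])
    # Design matrix W = [1 | Z]: an intercept is the only fixed effect.
    W = [[1.0] + list(row) for row in Z]
    # Coefficient matrix W'W, with lam added to the marker diagonal only,
    # so every marker effect is shrunk equally and the mean is not shrunk.
    C = [[sum(W[k][i] * W[k][j] for k in range(n)) for j in range(p + 1)]
         for i in range(p + 1)]
    for j in range(1, p + 1):
        C[j][j] += lam
    rhs = [sum(W[k][i] * y[k] for k in range(n)) for i in range(p + 1)]
    sol = solve(C, rhs)
    return sol[0], sol[1:]


# Toy data: 4 individuals, 2 SNPs coded 0/1/2 (made-up values).
Z = [[0, 1], [1, 2], [2, 0], [1, 1]]
y = [1.0, 2.5, 2.0, 1.8]
mu, u = rr_blup(y, Z, lam=1.0)
# Genomic estimated breeding values: the marker part Z u of each prediction.
gebv = [sum(z * e for z, e in zip(row, u)) for row in Z]
```

The coefficient matrix has one row and column per SNP marker plus the fixed effects, so its size grows with the number of markers. That is the dimensionality the abstract refers to when it argues for a distributed-memory framework.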


Pages: 813 / +

Number of pages: 12
