Risk estimation and risk prediction using machine-learning methods

Cited by: 0
Authors
Jochen Kruppa
Andreas Ziegler
Inke R. König
Affiliations
[1] Universität zu Lübeck, Institut für Medizinische Biometrie und Statistik
[2] Universitätsklinikum Schleswig-Holstein
[3] Campus Lübeck
Source
Human Genetics | 2012 / Vol. 131
Keywords
Lasso; Probability Estimation; Multifactor Dimensionality Reduction; Brier Score; Single Nucleotide Polymorphism
DOI
Not available
Abstract
After an association between genetic variants and a phenotype has been established, further study goals comprise the classification of patients according to disease risk or the estimation of disease probability. To accomplish this, different statistical methods are required, and specifically machine-learning approaches may offer advantages over classical techniques. In this paper, we describe methods for the construction and evaluation of classification and probability estimation rules. We review the use of machine-learning approaches in this context and explain some of the machine-learning algorithms in detail. Finally, we illustrate the methodology through application to a genome-wide association analysis on rheumatoid arthritis.
Pages: 1639-1654
Number of pages: 15