Fifty Years of Classification and Regression Trees

Cited by: 418
Authors
Loh, Wei-Yin [1]
Affiliation
[1] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
Funding
U.S. National Institutes of Health; U.S. National Science Foundation
Keywords
Classification trees; regression trees; machine learning; prediction; multivariate regression; variable importance; split selection; survival trees; decision tree; logistic regression; recursive partition; ensemble methods; model; bias
DOI
10.1111/insr.12016
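Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract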
Fifty years have passed since the publication of the first regression tree algorithm. New techniques have added capabilities that far surpass those of the early methods. Modern classification trees can partition the data with linear splits on subsets of variables and fit nearest neighbor, kernel density, and other models in the partitions. Regression trees can fit almost every kind of traditional statistical model, including least-squares, quantile, logistic, Poisson, and proportional hazards models, as well as models for longitudinal and multiresponse data. Greater availability and affordability of software (much of which is free) have played a significant role in helping the techniques gain acceptance and popularity in the broader scientific community. This article surveys the developments and briefly reviews the key ideas behind some of the major algorithms.
Pages: 329-348
Page count: 20