Statistical data mining

Cited by: 5
Author
Banks, David L. [1]
Affiliation
[1] Duke Univ, Inst Stat & Decis Sci, Durham, NC 27708 USA
Keywords
classification; clustering; machine learning; nonparametric regression; random forests; support vector machines
DOI
10.1002/wics.53
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
Data mining is widely used in modern science to extract signal from complex data sets. This article summarizes some of the key intellectual issues in the development of this field, largely from a historical perspective. There is particular emphasis on the Curse of Dimensionality, and its implications for non-parametric regression, classification, and cluster analysis. © 2009 John Wiley & Sons, Inc.
Pages: 9-25
Number of pages: 17