A review of software packages for data mining

Cited by: 21
Authors
Haughton, D
Deichmann, J
Eshghi, A
Sayek, S
Teebagy, N
Topi, H
Affiliations
[1] Bentley Coll, Data Anal Res Team, Waltham, MA 02452 USA
[2] Bilkent Univ, Bilkent, Turkey
Keywords
Clementine; Ghostminer; Quadstone; SAS Enterprise Miner; XLMiner
DOI
10.1198/0003130032486
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We present to the statistical community an overview of five data mining packages with the intent of leaving the reader with a sense of the different capabilities, the ease or difficulty of use, and the user interface of each package. We are not attempting to perform a controlled comparison of the algorithms in each package to decide which has the strongest predictive power, but instead hope to give an idea of the approach to predictive modeling used in each of them. The packages are compared in the areas of descriptive statistics and graphics, predictive models, and association (market basket) analysis. As expected, the packages affiliated with the most popular statistical software packages (SAS and SPSS) provide the broadest range of features with remarkably similar modeling and interface approaches, whereas the other packages all have their special sets of features and specific target audiences whom we believe each of the packages will serve well. It is essential that an organization considering the purchase of a data mining package carefully evaluate the available options and choose the one that provides the best fit with its particular needs.
Pages: 290-309
Page count: 20