Data mining tools

Cited by: 77
Authors
Mikut, Ralf [1]
Reischl, Markus [1]
Affiliations
[1] Karlsruhe Inst Technol, D-76344 Eggenstein Leopoldshafen, Germany
Keywords
KNOWLEDGE DISCOVERY; SOFTWARE;
DOI
10.1002/widm.24
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
The development and application of data mining algorithms requires the use of powerful software tools. As the number of available tools continues to grow, the choice of the most suitable tool becomes increasingly difficult. This paper attempts to support the decision-making process by discussing the historical development and presenting a range of existing state-of-the-art data mining and related tools. Furthermore, we propose criteria for the tool categorization based on different user groups, data structures, data mining tasks and methods, visualization and interaction styles, import and export options for data and models, platforms, and license policies. These criteria are then used to classify data mining tools into nine different types. The typical characteristics of these types are explained and a selection of the most important tools is categorized. This paper is organized as follows: the first section, "Historical Development and State-of-the-Art", highlights the historical development of data mining software until present; the criteria to compare data mining software are explained in the second section, "Criteria for Comparing Data Mining Software". The last section, "Categorization of Data Mining Software into Different Types", proposes a categorization of data mining software and introduces typical software tools for the different types. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1: 431-443. DOI: 10.1002/widm.24
Pages: 431-443
Page count: 13