A Knowledge Discovery System with Support for Model Selection and Visualization

Cited by: 0
Authors
Tu Bao Ho
Trong Dung Nguyen
Hiroshi Shimodaira
Masayuki Kimura
Affiliation
[1] Japan Advanced Institute of Science and Technology,
Source
Applied Intelligence | 2003 / Volume 19
Keywords
KDD; model selection; visualization; user's participation
DOI
Not available
Abstract
The process of knowledge discovery in databases consists of several steps that are iterative and interactive. In each application, to go through this process the user has to exploit different algorithms and their settings that usually yield multiple models. Model selection, that is, the selection of appropriate models or algorithms to achieve such models, requires meta-knowledge of algorithm/model and model performance metrics. Therefore, model selection is usually a difficult task for the user. We believe that simplifying the process of model selection for the user is crucial to the success of real-life knowledge discovery activities. As opposed to most related work that aims to automate model selection, in our view model selection is a semiautomatic process, requiring an effective collaboration between the user and the discovery system. For such a collaboration, our solution is to give the user the ability to try various alternatives and to compare competing models quantitatively by performance metrics, and qualitatively by effective visualization. This paper presents our research on model selection and visualization in the development of a knowledge discovery system called D2MS. The paper addresses the motivation of model selection in knowledge discovery and related work, gives an overview of D2MS, and describes its solution to model selection and visualization. It then presents the usefulness of D2MS model selection in two case studies of discovering medical knowledge in hospital data—on meningitis and stomach cancer—using three data mining methods of decision trees, conceptual clustering, and rule induction.
Pages: 125-141
Page count: 16