Toward intelligent assistance for a data mining process: An ontology-based approach for cost-sensitive classification

Cited by: 87
Authors
Bernstein, A
Provost, F
Hill, S
Affiliations
[1] Univ Zurich, Dept Informat, CH-8057 Zurich, Switzerland
[2] NYU, Stern Sch Business, New York, NY 10012 USA
Keywords
cost-sensitive learning; data mining; data mining process; intelligent assistants; knowledge discovery; knowledge discovery process; machine learning; metalearning
DOI
10.1109/TKDE.2005.67
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A data mining (DM) process involves multiple stages. A simple, but typical, process might include preprocessing data, applying a data mining algorithm, and postprocessing the mining results. There are many possible choices for each stage, and only some combinations are valid. Because of the large space and nontrivial interactions, both novices and data mining specialists need assistance in composing and selecting DM processes. Extending notions developed for statistical expert systems, we present a prototype Intelligent Discovery Assistant (IDA), which provides users with 1) systematic enumerations of valid DM processes, so that important, potentially fruitful options are not overlooked, and 2) effective rankings of these valid processes by different criteria, to facilitate the choice of DM processes to execute. We use the prototype to show that an IDA can indeed provide useful enumerations and effective rankings in the context of simple classification processes. We discuss how an IDA could be an important tool for knowledge sharing among a team of data miners. Finally, we illustrate the claims with a demonstration of cost-sensitive classification using a more complicated process and data from the 1998 KDDCUP competition.
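Illustrative sketch (not from the paper): the two services named in the abstract, enumerating valid DM processes and ranking them by a chosen criterion, might be pictured roughly as follows. Everything here is an assumption made for illustration: the Operator record, its requires/adds/removes fields, the operator names, and the speed and accuracy numbers are hypothetical and are not taken from the IDA prototype.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Operator:
    """Hypothetical ontology entry for one DM operator (illustrative only)."""
    name: str
    stage: str                        # "pre", "induce", or "post"
    requires: frozenset = frozenset() # properties that must hold before running
    adds: frozenset = frozenset()     # properties the operator establishes
    removes: frozenset = frozenset()  # properties the operator destroys
    speed: float = 0.0                # made-up heuristic, higher is faster
    accuracy: float = 0.0             # made-up heuristic, higher is better


def enumerate_valid(operators, initial_state, stages=("pre", "induce", "post")):
    """Return every one-operator-per-stage plan whose preconditions all hold."""
    by_stage = [[op for op in operators if op.stage == s] for s in stages]
    valid = []
    for plan in product(*by_stage):
        state = set(initial_state)
        for op in plan:
            if not op.requires <= state:
                break                 # precondition violated: plan is invalid
            state = (state - op.removes) | op.adds
        else:
            valid.append(plan)        # loop finished without a violation
    return valid


def rank(plans, weight_accuracy=1.0, weight_speed=0.0):
    """Sort plans by a weighted sum of the per-operator heuristics, best first."""
    def score(plan):
        return sum(weight_accuracy * op.accuracy + weight_speed * op.speed
                   for op in plan)
    return sorted(plans, key=score, reverse=True)


# Hypothetical operator catalogue; all numbers are invented for the example.
OPERATORS = [
    Operator("none_pre", "pre", speed=1.0),
    Operator("discretize", "pre", requires=frozenset({"continuous"}),
             adds=frozenset({"categorical"}), removes=frozenset({"continuous"}),
             speed=0.5),
    Operator("naive_bayes", "induce", requires=frozenset({"categorical"}),
             adds=frozenset({"model"}), speed=0.95, accuracy=0.7),
    Operator("decision_tree", "induce", adds=frozenset({"model", "tree"}),
             speed=0.4, accuracy=0.8),
    Operator("none_post", "post", speed=1.0),
    Operator("prune", "post", requires=frozenset({"tree"}),
             speed=0.6, accuracy=0.05),
]

valid_plans = enumerate_valid(OPERATORS, initial_state={"continuous"})
by_accuracy = rank(valid_plans)
by_speed = rank(valid_plans, weight_accuracy=0.0, weight_speed=1.0)

for plan in by_accuracy:
    print(" -> ".join(op.name for op in plan))
```

In this toy catalogue, plans that apply naive_bayes to undiscretized data, or prune a model that is not a tree, are filtered out during enumeration. Changing the weights passed to rank reorders the same valid set, which is the sense in which rankings "by different criteria" is meant here.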
Pages: 503-518
Page count: 16