Toward intelligent assistance for a data mining process: An ontology-based approach for cost-sensitive classification

Cited by: 87
Authors
Bernstein, A
Provost, F
Hill, S
Affiliations
[1] Univ Zurich, Dept Informat, CH-8057 Zurich, Switzerland
[2] NYU, Stern Sch Business, New York, NY 10012 USA
Keywords
cost-sensitive learning; data mining; data mining process; intelligent assistants; knowledge discovery; knowledge discovery process; machine learning; metalearning
DOI
10.1109/TKDE.2005.67
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A data mining (DM) process involves multiple stages. A simple, but typical, process might include preprocessing data, applying a data mining algorithm, and postprocessing the mining results. There are many possible choices for each stage, and only some combinations are valid. Because of the large space and nontrivial interactions, both novices and data mining specialists need assistance in composing and selecting DM processes. Extending notions developed for statistical expert systems, we present a prototype Intelligent Discovery Assistant (IDA), which provides users with 1) systematic enumerations of valid DM processes, so that important, potentially fruitful options are not overlooked, and 2) effective rankings of these valid processes by different criteria, to facilitate the choice of DM processes to execute. We use the prototype to show that an IDA can indeed provide useful enumerations and effective rankings in the context of simple classification processes. We discuss how an IDA could be an important tool for knowledge sharing among a team of data miners. Finally, we illustrate the claims with a demonstration of cost-sensitive classification using a more complicated process and data from the 1998 KDDCUP competition.
Pages: 503-518
Page count: 16