Linear correlation discovery in databases: a data mining approach

Cited by: 11
Authors
Chiang, RHL
Cecil, CEH
Lim, EP
Affiliations
[1] Univ Cincinnati, Coll Business, Dept Informat Syst, Cincinnati, OH 45221 USA
[2] Nanyang Technol Univ, Nanyang Business Sch, Singapore 639798, Singapore
[3] Nanyang Technol Univ, Sch Comp Engn, Ctr Adv Informat Syst, Singapore 639798, Singapore
Keywords
knowledge discovery in database; linear correlation; association measurement; data mining
DOI
10.1016/j.datak.2004.09.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Very little research in knowledge discovery has studied how to incorporate statistical methods to automate linear correlation discovery (LCD). We present an automatic LCD methodology that adopts statistical measurement functions to discover correlations from databases' attributes. Our methodology automatically pairs attribute groups having potential linear correlations, measures the linear correlation of each pair of attribute groups, and confirms the discovered correlation. The methodology is evaluated in two sets of experiments. The results demonstrate the methodology's ability to facilitate linear correlation discovery for databases with a large amount of data. (c) 2004 Elsevier B.V. All rights reserved.
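The three steps named in the abstract (pairing, measuring, confirming) can be illustrated with a minimal sketch. This is not the authors' methodology: it assumes single numeric attributes rather than attribute groups, uses the Pearson coefficient as the measurement function, and treats a fixed cutoff as the confirmation step. The names `pearson_r`, `discover_linear_correlations`, `threshold`, and the sample table are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; report none.
        return 0.0
    return cov / sqrt(var_x * var_y)


def discover_linear_correlations(table, threshold=0.9):
    """Pair every two attributes, measure r, keep pairs with |r| >= threshold.

    `table` maps attribute name -> list of numeric values (assumed layout).
    The threshold is an illustrative stand-in for a confirmation step.
    """
    found = []
    for a, b in combinations(sorted(table), 2):
        r = pearson_r(table[a], table[b])
        if abs(r) >= threshold:
            found.append((a, b, r))
    return found


# Hypothetical sample data: "tax" is an exact linear function of "price".
table = {
    "price": [10.0, 20.0, 30.0, 40.0],
    "tax": [1.0, 2.0, 3.0, 4.0],
    "noise": [5.0, 1.0, 4.0, 2.0],
}
found = discover_linear_correlations(table)
```

On this sample only the `price`/`tax` pair passes the cutoff. Exhaustive pairing grows quadratically with the number of attributes, so a pre-filtering step that limits candidate pairs would matter on a large schema.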
Pages: 311-337
Page count: 27