Sampling large databases for association rules

Author
Toivonen, H
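DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract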
Discovery of association rules is an important database mining problem. Current algorithms for finding association rules require several passes over the analyzed database, and obviously the role of I/O overhead is very significant for very large databases. We present new algorithms that reduce the database activity considerably. The idea is to pick a random sample, to find using this sample all association rules that probably hold in the whole database, and then to verify the results with the rest of the database. The algorithms thus produce exact association rules, not approximations based on a sample. The approach is, however, probabilistic, and in those rare cases where our sampling method does not produce all association rules, the missing rules can be found in a second pass. Our experiments show that the proposed algorithms can find association rules very efficiently in only one database pass.
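The sample-then-verify idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the brute-force level-wise search, and the `lowered_support` parameter (a threshold used on the sample to reduce the chance of missing sets) are assumptions made for illustration. For simplicity the verification pass counts candidates over the whole database rather than only the part outside the sample, and the check for missed sets that would trigger a second pass is omitted.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force level-wise search (illustrative helper, not from the paper).

    Returns {itemset: relative support} for every itemset whose relative
    support in `transactions` is at least `min_support`.
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    current = [frozenset([item]) for item in items]
    while current:
        # Count each candidate of the current size.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(level)
        # Build candidates one item larger by joining pairs of frequent sets.
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == size})
    return result


def sample_then_verify(transactions, min_support, sample_size,
                       lowered_support, seed=0):
    """Find candidate itemsets on a random sample, then verify them exactly.

    `lowered_support` is a hypothetical parameter: a threshold used on the
    sample, typically below `min_support`.
    """
    rng = random.Random(seed)
    sample = rng.sample(transactions, sample_size)
    candidates = frequent_itemsets(sample, lowered_support)

    # Single pass over the full database: exact counts for the candidates.
    counts = dict.fromkeys(candidates, 0)
    for t in transactions:
        for c in counts:
            if c <= t:
                counts[c] += 1
    n = len(transactions)
    return {c: k / n for c, k in counts.items() if k / n >= min_support}


db = [frozenset(t) for t in (
    ["a", "b"], ["a", "b", "c"], ["a", "c"],
    ["b"], ["a", "b"], ["a", "b", "c"],
)]
verified = sample_then_verify(db, min_support=0.5, sample_size=3,
                              lowered_support=0.3, seed=1)
```

Because every reported support is counted on the full database, the sketch never reports an itemset that is infrequent overall. It can still miss frequent itemsets that did not appear as candidates in the sample, which is the case the abstract handles with a second pass.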
Pages: 134-145
Page count: 12