Mining multi-relational high utility itemsets from star schemas

Cited by: 8
Authors
Song, Wei [1, 2]
Jiang, Beisi [1]
Qiao, Yangyang [1]
Affiliations
[1] North China Univ Technol, Sch Comp Sci & Technol, Beijing 100144, Peoples R China
[2] Beijing Key Lab Integrat & Anal Large Scale Strea, Beijing 100144, Peoples R China
Funding
Beijing Natural Science Foundation;
Keywords
Multi-relational high utility itemsets; star schema; item index; transaction index; dimensional tree; relational tree; FREQUENT ITEMSETS; ALGORITHM; DATABASES; PATTERNS; INDEX;
DOI
10.3233/IDA-163231
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mining high utility itemsets is an interesting research problem in data mining and knowledge discovery. Most high utility itemset discovery algorithms seek patterns in a single table, but few are dedicated to processing data stored using a multi-dimensional model. In this paper, the problem of mining high utility itemsets in multi-relational databases is investigated, and two algorithms, RHUI-Mine and RHUI-Growth, are proposed for star schema-based data warehouses. In the RHUI-Mine algorithm, the search space is traversed in a level-wise manner, and an item index and transaction index are proposed to represent item and transaction information, respectively. The RHUI-Growth algorithm traverses the search space recursively using a pattern growth approach, and a dimensional tree and relational tree are used to compress the original data. Neither algorithm materializes the join operation between tables, thus making use of the star schema properties. Experiments show that both RHUI-Mine and RHUI-Growth are effective approaches for mining high utility itemsets in multi-relational data.
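The idea of evaluating itemsets over a star schema without materializing the join can be illustrated with a small sketch. This is not the paper's RHUI-Mine or RHUI-Growth; it is a brute-force, level-wise enumeration with no pruning, written only to show how a per-item index over dimension keys lets each fact row be tested directly. Everything here is an assumption made for illustration: the toy tables, the names `build_item_index`, `utility`, and `high_utility_itemsets`, and the utility definition (fact-row quantity times the sum of per-item unit utilities).

```python
from itertools import combinations

# Hypothetical toy star schema: two dimension tables and one fact table.
# Each dimension row maps a key to the set of items (attribute values) it holds.
dimensions = {
    "customer": {"c1": {"city=Beijing", "age=young"},
                 "c2": {"city=Tianjin", "age=young"}},
    "product": {"p1": {"brand=A", "type=phone"},
                "p2": {"brand=B", "type=phone"}},
}

# Fact rows reference dimension rows by key and carry a quantity measure.
facts = [
    {"customer": "c1", "product": "p1", "qty": 2},
    {"customer": "c2", "product": "p1", "qty": 1},
    {"customer": "c1", "product": "p2", "qty": 3},
]

# Assumed external (unit) utility of each item.
unit_utility = {"city=Beijing": 1, "city=Tianjin": 2, "age=young": 1,
                "brand=A": 5, "brand=B": 3, "type=phone": 2}


def build_item_index(dimensions):
    """Map each item to (dimension name, set of dimension keys containing it).

    Assumes an item name occurs in exactly one dimension table.
    """
    index = {}
    for dim, rows in dimensions.items():
        for key, items in rows.items():
            for item in items:
                _, keys = index.setdefault(item, (dim, set()))
                keys.add(key)
    return index


def utility(itemset, facts, index, unit_utility):
    """Sum the utility of an itemset over fact rows, with no joined table.

    A fact row supports the itemset when, for every item, the row's foreign
    key in that item's dimension is among the keys indexed for the item.
    """
    per_unit = sum(unit_utility[i] for i in itemset)
    total = 0
    for row in facts:
        if all(row[index[i][0]] in index[i][1] for i in itemset):
            total += row["qty"] * per_unit
    return total


def high_utility_itemsets(facts, dimensions, unit_utility, min_util, max_size=3):
    """Enumerate itemsets size by size and keep those reaching min_util."""
    index = build_item_index(dimensions)
    items = sorted(index)
    result = {}
    for size in range(1, max_size + 1):
        for cand in combinations(items, size):
            u = utility(cand, facts, index, unit_utility)
            if u >= min_util:
                result[cand] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(facts, dimensions, unit_utility, 15).items():
        print(itemset, u)
```

Because utility is not anti-monotone, a practical miner cannot prune candidates by utility alone the way frequent itemset miners prune by support; the sketch sidesteps this by enumerating every candidate up to `max_size`, which is only feasible on toy data.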
Pages: 143-165
Page count: 23