Constraint-based sequential pattern mining: the pattern-growth methods

Cited by: 139
Authors
Pei, Jian [1]
Han, Jiawei
Wang, Wei
Affiliations
[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada
[2] Univ Illinois, Urbana, IL 61801 USA
[3] Fudan Univ, Shanghai 200433, Peoples R China
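Funding
US National Science Foundation; Natural Sciences and Engineering Research Council of Canada; National Natural Science Foundation of China;
Keywords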
sequential pattern mining; frequent pattern mining; mining with constraints; pattern-growth methods;
DOI
10.1007/s10844-006-0006-z
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Constraints are essential for many sequential pattern mining applications. However, there is no systematic study on constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit our mission well. An extended framework is developed based on a sequential pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential pattern mining under this new framework. Moreover, this framework can be extended to constraint-based structured pattern mining as well.
Pages: 133-160
Page count: 28