k-Pattern Set Mining under Constraints

Cited by: 60
Authors
Guns, Tias [1]
Nijssen, Siegfried [1]
De Raedt, Luc [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Comp Sci, B-3000 Louvain, Belgium
Keywords
Data mining; pattern set mining; constraints; constraint programming
DOI
10.1109/TKDE.2011.204
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce the problem of k-pattern set mining, concerned with finding a set of k related patterns under constraints. This contrasts with regular pattern mining, where one searches for many individual patterns. The k-pattern set mining problem is a very general problem that can be instantiated to a wide variety of well-known mining tasks, including concept learning, rule learning, redescription mining, conceptual clustering, and tiling. To this end, we formulate a large number of constraints for use in k-pattern set mining, both at the local level, that is, on individual patterns, and at the global level, that is, on the overall pattern set. Building general solvers for the pattern set mining problem remains a challenge. Here, we investigate to what extent constraint programming (CP) can be used as a general solution strategy. We present a mapping of pattern set constraints to constraints currently available in CP. This allows us to investigate a large number of settings within a unified framework and to gain insight into the possibilities and limitations of these solvers. This is important as it allows us to create guidelines on how to model new problems successfully and how to model existing problems more efficiently. It also opens up the way for other solver technologies.
Pages: 402-418
Number of pages: 17