Mining Time-constrained Sequential Patterns with Constraint Programming

Cited by: 0
Authors
John O. R. Aoga
Tias Guns
Pierre Schaus
Affiliations
[1] Université catholique de Louvain (UCLouvain), Institute of Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM)
[2] Université d’Abomey-Calavi (UAC), Ecole Doctorale Science de l’Ingénieur (ED-SDI)
[3] Vrije Universiteit Brussel (VUB)
[4] Katholieke Universiteit Leuven
Source
Constraints | 2017 / Vol. 22
Keywords
Data mining; Sequential pattern mining; Constraint programming; Global constraint; Gap constraint; Span constraint; Time constraint
DOI
Not available
Abstract
Constraint Programming (CP) has proven to be an effective platform for constraint-based sequence mining. Previous work has focused on standard frequent sequence mining, as well as frequent sequence mining with a maximum 'gap' between two matching events in a sequence. The main challenge in the latter is that this constraint cannot be imposed independently of the omnipresent frequency constraint. Indeed, the gap constraint changes whether a subsequence is included in a sequence, and hence its frequency. In this work, we go beyond that and investigate the integration of timed events and constraints on the minimum/maximum gap as well as the minimum/maximum span. The latter constrains the allowed time between the first and last matching event of a pattern. We show how the three are interrelated, and what changes to the frequency constraint are required. Key to our approach is the concept of an extension window defined by gap and span; we develop techniques to avoid scanning the sequences needlessly, and use a backtracking-aware data structure. Experiments demonstrate that the proposed approach outperforms both specialized and CP-based approaches in almost all cases and that the advantage increases as the minimum frequency threshold decreases. This paper is an extension of the original manuscript presented at CPAIOR'17 [5].
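To illustrate why gap and span constraints cannot be checked separately from frequency, here is a minimal Python sketch. It is not the authors' CP-based method: it is a naive backtracking check, and every name in it (`has_embedding`, `frequency`, the parameter names) is hypothetical. It assumes each sequence is a list of `(timestamp, symbol)` events sorted by timestamp, that a gap is the time difference between two consecutive matched events, and that the span is the time between the first and last matched event.

```python
from typing import List, Tuple

Event = Tuple[int, str]  # (timestamp, symbol); assumed sorted by timestamp

INF = float("inf")


def has_embedding(seq: List[Event], pattern: List[str],
                  min_gap: float = 0, max_gap: float = INF,
                  min_span: float = 0, max_span: float = INF) -> bool:
    """Return True if `pattern` occurs in `seq` under the time constraints."""
    if not pattern:
        return True

    def extend(k: int, first_t, prev_t, start: int) -> bool:
        # All pattern symbols matched: only the minimum span is left to check.
        if k == len(pattern):
            return prev_t - first_t >= min_span
        for i in range(start, len(seq)):
            t, sym = seq[i]
            if prev_t is not None:
                # Events are time-sorted, so later events only violate more.
                if t - prev_t > max_gap or t - first_t > max_span:
                    break
                if t - prev_t < min_gap:
                    continue
            if sym == pattern[k]:
                new_first = t if first_t is None else first_t
                # Backtracking: the earliest match is not always extendable.
                if extend(k + 1, new_first, t, i + 1):
                    return True
        return False

    return extend(0, None, None, 0)


def frequency(db: List[List[Event]], pattern: List[str], **constraints) -> int:
    """Number of sequences in `db` containing a valid embedding of `pattern`."""
    return sum(has_embedding(seq, pattern, **constraints) for seq in db)


db = [
    [(1, "A"), (2, "B"), (5, "C")],
    [(1, "A"), (4, "B"), (6, "C")],
    [(1, "A"), (3, "C"), (9, "B")],
]
print(frequency(db, ["A", "B"]))             # unconstrained support
print(frequency(db, ["A", "B"], max_gap=2))  # support under a maximum gap
```

On this toy database the pattern A, B is supported by all three sequences without constraints but by only one once `max_gap=2` is imposed, which shows the constraint changing the frequency itself. The bounds checked before each extension step play the role of a simple extension window; the sketch rescans events and does not include the scan-avoidance techniques or backtracking-aware data structure mentioned in the abstract.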
Pages: 548-570
Number of pages: 22
Related papers
29 in total
[1] Beldiceanu N (1994) Introducing global constraints in chip. Mathematical and computer Modelling 20, 97-123
[2] Contejean E (2015) Efficient constraint-based sequential pattern mining (spm) algorithm to understand customers buying behaviour from time stamp-based sequence dataset. Cogent Engineering 2, 1072,292-418
[3] Desai NAK (2013) k-pattern set mining under constraints. IEEE Transactions on Knowledge and Data Engineering 25, 402-87
[4] Ganatra A (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data mining and knowledge discovery 8, 53-144
[5] Guns T (2014) Bicspam: flexible biclustering using sequential patterns. BMC Bioinformatics 15, 130-306
[6] Nijssen S (2010) Grammar constraints. Constraints 15, 117-289
[7] De Raedt L (2017) Prefix-projection global constraint and top-k approach for sequential pattern mining. Constraints 22, 265-160
[8] Han J (1997) Discovery of frequent episodes in event sequences. Data mining and knowledge discovery 1, 259-1056
[9] Pei J (2007) Constraint-based sequential pattern mining: the pattern-growth methods. Journal of Intelligent Information Systems 28, 133
[10] Yin Y (2007) Frequent closed sequence mining without candidate maintenance. IEEE Transactions on Knowledge and Data Engineering 19, 1042