Mining Behavioral Sequence Constraints for Classification

被引:26
作者
De Smedt, Johannes [1 ]
Deeva, Galina [2 ]
De Weerdt, Jochen [2 ]
机构
[1] Univ Edinburgh, Edinburgh EH8 9YL, Midlothian, Scotland
[2] Katholieke Univ Leuven, B-3000 Leuven, Belgium
关键词
Databases; Data mining; Feature extraction; Distance measurement; Knowledge based systems; Task analysis; Sequence classification; sequential pattern mining; behavioral constraint templates; Declare; FREQUENT; PATTERNS; DISCOVERY;
D O I
10.1109/TKDE.2019.2897311
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Sequence classification deals with the task of finding discriminative and concise sequential patterns. To this purpose, many techniques have been proposed, which mainly resort to the use of partial orders to capture the underlying sequences in a database according to the labels. Partial orders, however, pose many limitations, especially on expressiveness, i.e., the aptitude towards capturing certain behavior, and on conciseness, i.e., doing so in a compact and informative way. These limitations can be addressed by using a better representation. In this paper, we present the interesting Behavioral Constraint Miner (iBCM), a sequence classification technique that discovers patterns using behavioral constraint templates. The templates comprise a variety of constraints and can express patterns ranging from simple occurrence, to looping and position-based behavior over a sequence. Furthermore, iBCM also captures negative constraints, i.e., absence of particular behavior. The constraints can be discovered by using simple string operations in an efficient way. Finally, deriving the constraints with a window-based approach allows to pinpoint where the constraints hold in a string, and to detect whether patterns are subject to concept drift. Through empirical evaluation, it is shown that iBCM is better capable of classifying sequences more accurately and concisely in a scalable manner.
引用
收藏
页码:1130 / 1142
页数:13
相关论文
共 43 条
[1]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[2]   Mining Time-constrained Sequential Patterns with Constraint Programming [J].
Aoga, John O. R. ;
Guns, Tias ;
Schaus, Pierre .
CONSTRAINTS, 2017, 22 (04) :548-570
[3]   Mining weighted sequential patterns in a sequence database with a time-interval weight [J].
Chang, Joong Hyuk .
KNOWLEDGE-BASED SYSTEMS, 2011, 24 (01) :1-9
[4]  
Cheng H, 2007, PROC INT CONF DATA, P691
[5]   A SAT-Based Approach for Discovering Frequent, Closed and Maximal Patterns in a Sequence [J].
Coquery, Emmanuel ;
Jabbour, Said ;
Sais, Lakhdar ;
Salhi, Yakoub .
20TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2012), 2012, 242 :258-+
[6]  
Cule B, 2010, LECT NOTES ARTIF INT, V6118, P300
[7]   Behavioral Constraint Template-Based Sequence Classification [J].
De Smedt, Johannes ;
Deeva, Galina ;
De Weerdt, Jochen .
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2017, PT II, 2017, 10535 :20-36
[8]   Discovery of frequent DATALOG patterns [J].
Dehaspe, L ;
Toivonen, H .
DATA MINING AND KNOWLEDGE DISCOVERY, 1999, 3 (01) :7-36
[9]   Efficient discovery of Target-Branched Declare constraints [J].
Di Ciccio, Claudio ;
Maggi, Fabrizio Maria ;
Mendling, Jan .
INFORMATION SYSTEMS, 2016, 56 :258-283
[10]  
Di Ciccio C, 2013, 2013 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING (CIDM), P135, DOI 10.1109/CIDM.2013.6597228