SeqScout: Using a Bandit Model to Discover Interesting Subgroups in Labeled Sequences

被引:7
作者
Mathonat, Romain [1 ,2 ]
Nurbakova, Diana [1 ]
Boulicaut, Jean-Francois [1 ]
Kaytoue, Mehdi [1 ,3 ]
机构
[1] Univ Lyon, CNRS, INSA Lyon, LIRIS,UMR5205, F-69621 Villeurbanne, France
[2] ATOS, F-69100 Villeurbanne, France
[3] Infologic, F-69007 Lyon, France
来源
2019 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2019) | 2019年
关键词
Pattern Mining; Subgroup Discovery; Sequences; Upper Confidence Bound; PATTERN; ALGORITHM; SET;
D O I
10.1109/DSAA.2019.00022
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
It is extremely useful to exploit labeled datasets not only to learn models but also to improve our understanding of a domain and its available targeted classes. The so-called subgroup discovery task has been considered for a long time. It concerns the discovery of patterns or descriptions, the set of supporting objects of which have interesting properties, e.g., they characterize or discriminate a given target class. Though many subgroup discovery algorithms have been proposed for transactional data, discovering subgroups within labeled sequential data and thus searching for descriptions as sequential patterns has been much less studied. In that context, exhaustive exploration strategies can not be used for real-life applications and we have to look for heuristic approaches. We propose the algorithm SeqScout to discover interesting subgroups (w.r.t. a chosen quality measure) from labeled sequences of itemsets. This is a new sampling algorithm that mines discriminant sequential patterns using a multi-armed bandit model. It is an anytime algorithm that, for a given budget, finds a collection of local optima in the search space of descriptions and thus subgroups. It requires a light configuration and it is independent from the quality measure used for pattern scoring. Furthermore, it is fairly simple to implement. We provide qualitative and quantitative experiments on several datasets to illustrate its added-value.
引用
收藏
页码:81 / 90
页数:10
相关论文
共 29 条
[1]  
Agrawal R., P IEEE ICDE 1995, P3
[2]  
[Anonymous], 2012, REGRET ANAL STOCHAST
[3]  
[Anonymous], 2009, Artificial intelligence-A modern approach
[4]  
Atzmller M., P PKDD 2006, P6
[5]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[6]  
Ayres J., P ACM SIGKDD 2002, P429
[7]  
Belfodil A., P ECML PKDD 2018 2, P500
[8]  
Boley M., P ACM SIGKDD 2011, P582
[9]   Anytime discovery of a diverse set of patterns with Monte Carlo tree search [J].
Bosc, Guillaume ;
Boulicaut, Jean-Francois ;
Raissi, Chedy ;
Kaytoue, Mehdi .
DATA MINING AND KNOWLEDGE DISCOVERY, 2018, 32 (03) :604-650
[10]   A Pattern Mining Approach to Study Strategy Balance in RTS Games [J].
Bosc, Guillaume ;
Tan, Philip ;
Boulicaut, Jean-Francois ;
Raissi, Chedy ;
Kaytoue, Mehdi .
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2017, 9 (02) :123-132