Genetic process mining: an experimental evaluation

Cited by: 0
Authors
A. K. A. de Medeiros
A. J. M. M. Weijters
W. M. P. van der Aalst
Affiliation
[1] Eindhoven University of Technology,Department of Technology Management
Source
Data Mining and Knowledge Discovery | 2007, Vol. 14
Keywords
Process mining; Genetic mining; Genetic algorithms; Petri nets; Workflow nets
DOI
Not available
Abstract
One of the aims of process mining is to retrieve a process model from an event log. The discovered models can be used as objective starting points during the deployment of process-aware information systems (Dumas et al., eds., Process-Aware Information Systems: Bridging People and Software Through Process Technology. Wiley, New York, 2005) and/or as a feedback mechanism to check prescribed models against enacted ones. However, current techniques have problems when mining processes that contain non-trivial constructs and/or when the logs contain noise. Most of these problems arise because many current techniques rely on local information in the event log. To overcome them, we use genetic algorithms to mine process models. The main motivation is to benefit from the global search performed by this kind of algorithm. The non-trivial constructs are tackled by choosing an internal representation that supports them. Noise is tackled naturally because genetic algorithms are, by definition, robust to noise. The main challenge in a genetic approach is defining a good fitness measure, because it guides the global search. This paper explains how the genetic algorithm works. Experiments with synthetic and real-life logs show that the fitness measure indeed leads to the mining of process models that are complete (can reproduce all the behavior in the log) and precise (do not allow for extra behavior that cannot be derived from the event log). The genetic algorithm is implemented as a plug-in in the ProM framework.
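The idea of a fitness measure that rewards completeness and penalizes extra behavior can be illustrated with a minimal sketch. This is not the paper's algorithm or its internal representation: the toy log, the encoding of a model as a set of directly-follows edges, the weight `kappa`, and all function names below are assumptions made for illustration only. The edge-set encoding is local by construction and cannot express the non-trivial constructs the paper targets; it only shows how a fitness function can steer a genetic search.

```python
import random

# Toy event log (hypothetical data): each trace is a sequence of activity names.
LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "b", "c", "d")]
ACTIVITIES = sorted({act for trace in LOG for act in trace})
ALL_EDGES = [(x, y) for x in ACTIVITIES for y in ACTIVITIES if x != y]


def steps_of(log):
    """All directly-follows steps in the log, with repetitions."""
    return [(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)]


def completeness(model, log):
    """Fraction of steps in the log that the model allows."""
    steps = steps_of(log)
    return sum(step in model for step in steps) / len(steps)


def precision(model, log):
    """Fraction of the model's edges that are observed in the log."""
    if not model:
        return 0.0
    return len(model & set(steps_of(log))) / len(model)


def fitness(model, log, kappa=0.5):
    """Reward replayable behavior, penalize extra behavior (kappa is assumed)."""
    return completeness(model, log) - kappa * (1.0 - precision(model, log))


def random_model(rng):
    """Random individual: each candidate edge is included with probability 0.5."""
    return frozenset(e for e in ALL_EDGES if rng.random() < 0.5)


def crossover(p1, p2, rng):
    """Uniform crossover: each edge is inherited from one parent at random."""
    return frozenset(
        e for e in ALL_EDGES if (e in p1 if rng.random() < 0.5 else e in p2)
    )


def mutate(model, rng, rate=0.05):
    """Flip the presence of each edge with a small probability."""
    flipped = {e for e in ALL_EDGES if rng.random() < rate}
    return frozenset(set(model) ^ flipped)


def tournament(population, log, rng, size=3):
    """Pick the fittest of a few randomly sampled individuals."""
    return max(rng.sample(population, size), key=lambda m: fitness(m, log))


def evolve(log, generations=50, pop_size=30, seed=0):
    """Elitist genetic search over edge sets; returns the fittest individual."""
    rng = random.Random(seed)
    population = [random_model(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, log), reverse=True)
        elite = ranked[:2]  # the best individuals survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            p1 = tournament(ranked, log, rng)
            p2 = tournament(ranked, log, rng)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = elite + children
    return max(population, key=lambda m: fitness(m, log))


if __name__ == "__main__":
    best = evolve(LOG)
    print(sorted(best), fitness(best, LOG))
```

In this sketch a model containing exactly the observed edges scores the maximum fitness, while a model that allows every edge replays the whole log but loses points for imprecision. Because the two fittest individuals are carried over unchanged each generation, the best fitness in the population never decreases.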
Pages: 245-304
Page count: 59