Applying Sequence Mining for Outlier Detection in Process Mining

Cited by: 27
Authors
Sani, Mohammadreza Fani [1]
Van Zelst, Sebastiaan J. [2]
Van der Aalst, Wil M. P. [1, 2]
Affiliations
[1] Rhein Westfal TH Aachen, Proc & Data Sci Chair, D-52056 Aachen, Germany
[2] Rhein Westfal TH Aachen, Proc & Data Sci Chair, D-52056 Aachen, Germany
Source
ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS (OTM 2018), PT II | 2018 / Vol. 11230
Keywords
Process mining; Sequence mining; Event log filtering; Event log preprocessing; Sequential rule mining; Outlier detection; PROCESS MODELS; PROCESS DISCOVERY; LOGS
DOI
10.1007/978-3-030-02671-4_6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
One of the challenges in applying process mining algorithms to real event data is the presence of outlier behaviour. Such behaviour often leads to complex, incomprehensible, and sometimes even inaccurate process mining results. As a result, correct and/or important behaviour of the process may be concealed. In this paper, we exploit sequence mining techniques for the purpose of outlier detection in the process mining domain. Using the proposed approach, it is even possible to detect outliers in the case of heavy parallelism and/or long-term dependencies between business process activities. Our method has been implemented in both the ProM and the RapidProM frameworks. Using these implementations, we conducted a collection of experiments that show that we are able to detect and remove outlier behaviour in event data. Our evaluation clearly demonstrates that the proposed method accurately removes outlier behaviour and, indeed, improves process discovery results.
Pages: 98-116
Number of pages: 19