Data is Moody: Discovering Data Modification Rules from Process Event Logs

Cited by: 0
Authors
Schuster, Marco Bjarne [1]
Wiegand, Boris [2]
Vreeken, Jilles [3]
Affiliations
[1] Airbus Operat GmbH, Bremen, Germany
[2] Stahl Holding Saar, Dillingen, Germany
[3] CISPA Helmholtz Ctr Informat Secur, Saarbrucken, Germany
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT II, ECML PKDD 2024 | 2024 / Vol. 14942
Keywords
Process mining; Rule mining; MDL;
DOI
10.1007/978-3-031-70344-7_17
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although event logs are a powerful source of insight into the behavior of the underlying business process, existing work primarily focuses on finding patterns in the activity sequences of an event log while ignoring event attribute data. Event attribute data has mostly been used to predict event occurrences and process outcomes, but the state of the art neglects to mine succinct and interpretable rules describing how event attribute data changes during process execution. Subgroup discovery and rule-based classification approaches cannot capture the sequential dependencies present in event logs, and thus lead to unsatisfactory results with limited insight into process behavior. Given an event log, we aim to find accurate yet succinct and interpretable if-then rules that describe how the process modifies data. We formalize the problem in terms of the Minimum Description Length (MDL) principle, by which we choose the model with the best lossless description of the data. Additionally, we propose the greedy Moody algorithm to efficiently search for rules. Through extensive experiments on both synthetic and real-world data, we show that Moody indeed finds compact and interpretable rules, needs little data for accurate discovery, and is robust to noise.
Pages: 285-302
Page count: 18