Frequent pattern mining-based log file partition for process mining

Cited by: 3
Authors
Bantay, Laszlo [1]
Abonyi, Janos [1]
Affiliations
[1] Univ Pannonia, ELKH PE Complex Syst Monitoring Res Grp, Egyet U 10, H-8200 Veszprem, Hungary
Keywords
Frequent itemset mining; Frequent sequential pattern mining; Process mining; Log file pre-processing
DOI
10.1016/j.engappai.2023.106221
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Process mining is a technique for exploring models based on event sequences, growing in popularity in the process industry. Process mining algorithms assume that the processed log files contain events generated by only one unknown process, which can lead to extremely complex and inaccurate models when this assumption is not met. To address this issue, this article proposes a frequent pattern mining-based method for log file partitioning, allowing for the exploration of parallel processes. The key idea is that frequent pattern mining can identify grouped events and generate sub-logs of overlapping sub-processes. Thanks to the pre-processing of the log files, more compact and interpretable process models can be identified. We developed a set of goal-oriented metrics to evaluate the complexity of process mining problems and the resulting models. The applicability and effectiveness of the method are demonstrated in the analysis of process alarms of an industrial plant. The results confirm that the proposed method enables the discovery of targeted sub-process models by partitioning the log file using frequent pattern mining, and the effectiveness of the method increases with the number of parallel processes stored in the same log file. We recommend applying the method in every case where there is no clear start and end of the logged events so that the log file can describe different processes.
Pages: 11
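The key idea stated in the abstract, that frequent pattern mining can identify grouped events and generate sub-logs of possibly overlapping sub-processes, can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-width time windowing, the brute-force itemset counting, the use of maximal itemsets as event groups, and all names (`window_transactions`, `frequent_itemsets`, `maximal`, `partition_log`, `width`, `min_support`, `max_size`) are assumptions made for illustration only, and the toy log is invented.

```python
from collections import Counter
from itertools import combinations


def window_transactions(log, width):
    """Group (timestamp, event) records into fixed-width time windows.

    Assumption: co-occurrence within a window stands in for "grouped events".
    """
    windows = {}
    for ts, event in log:
        windows.setdefault(ts // width, set()).add(event)
    return list(windows.values())


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force count of itemsets (size 2..max_size) that occur in at
    least min_support transactions. A real miner (Apriori, FP-growth)
    would prune the search space instead of enumerating it."""
    counts = Counter()
    for items in transactions:
        for size in range(2, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c >= min_support}


def maximal(itemsets):
    """Keep only itemsets that are not a proper subset of another one."""
    return [s for s in itemsets if not any(s < other for other in itemsets)]


def partition_log(log, width, min_support, max_size=3):
    """Return one sub-log per maximal frequent itemset. An event type that
    belongs to several groups appears in several sub-logs (overlap)."""
    transactions = window_transactions(log, width)
    groups = maximal(frequent_itemsets(transactions, min_support, max_size))
    return {g: [(ts, e) for ts, e in log if e in g] for g in groups}


# Toy log (invented): two interleaved processes plus one rare event "Z".
log = [
    (0, "A"), (1, "B"), (2, "C"),
    (10, "X"), (11, "Y"),
    (20, "A"), (21, "B"), (22, "C"),
    (30, "X"), (31, "Y"),
    (40, "A"), (42, "Z"),
]
sub_logs = partition_log(log, width=10, min_support=2)
for group, events in sub_logs.items():
    print(sorted(group), events)
```

Each resulting sub-log could then be passed to a process discovery algorithm separately, instead of mining one model from the mixed log. The window width and support threshold are tuning choices in this sketch, not values taken from the paper.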