An Anti-Noise Process Mining Algorithm Based on Minimum Spanning Tree Clustering

Cited by: 18
Authors
Li, Weimin [1]
Zhu, Heng [1]
Liu, Wei [1]
Chen, Dehua [2]
Jiang, Jiulei [3]
Jin, Qun [4]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Technol, Shanghai 200240, Peoples R China
[2] Donghua Univ, Sch Comp Sci & Technol, Shanghai 200336, Peoples R China
[3] Beifang Univ Nationalities, Sch Comp, Yinchuan 750021, Peoples R China
[4] Waseda Univ, Fac Human Sci, Tokorozawa, Saitama 1698050, Japan
Funding
National Natural Science Foundation of China
Keywords
Process mining; noise; business process; event log; BPM; discovering process models; workflow
DOI
10.1109/ACCESS.2018.2865540
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Many human-centric systems now use business process management (BPM) technology in production. As these systems run, they accumulate growing volumes of business process logs and human-centric data, and using and analyzing these event logs effectively is an urgent challenge. Process mining, a branch of business process management, extracts process knowledge from event logs and builds process models, which helps to detect and improve business processes. Current process mining algorithms handle log noise inadequately. The family of alpha-algorithms ignores the impact of noise, which is unrealistic for real-life logs, and most of the process mining algorithms that can handle noise lack reasonable denoising thresholds. This paper gives a new assumption on noise and proposes an anti-noise process mining algorithm. Decision rules for the selective, parallel, and non-free-choice structures are also given. The proposed algorithm framework discovers the process model and transforms it into a Petri net representation. The distances between traces are calculated to build a minimum spanning tree, on which clusters are generated. The traces of the non-largest clusters are treated as noise, and the largest cluster is mined. In this way, the algorithm can discover the regular routing structures while handling noise. The experimental results show the correctness of the algorithm when compared with the alpha++ algorithm.
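The clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which trace distance or cluster-cutting rule the paper uses, so the sketch assumes Levenshtein edit distance between traces and a fixed cut threshold on minimum-spanning-tree edges. The names `edit_distance`, `minimum_spanning_tree`, `filter_noise`, and the parameter `max_edge` are hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two traces (assumed distance measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x by y
        prev = curr
    return prev[-1]


def minimum_spanning_tree(nodes, dist):
    """Prim's algorithm on the complete graph; returns (weight, u, v) edges."""
    if not nodes:
        return []
    # best[v] = (cheapest known distance from the tree to v, tree endpoint)
    best = {v: (dist(nodes[0], nodes[v]), 0) for v in range(1, len(nodes))}
    edges = []
    while best:
        v = min(best, key=lambda k: best[k][0])
        w, u = best.pop(v)
        edges.append((w, u, v))
        for k in best:
            d = dist(nodes[v], nodes[k])
            if d < best[k][0]:
                best[k] = (d, v)
    return edges


def filter_noise(log, max_edge=2):
    """Keep only the traces that fall in the largest MST cluster.

    max_edge is a hypothetical cut threshold: MST edges heavier than it
    are removed, and the remaining connected components are the clusters.
    """
    traces = [tuple(t) for t in log]
    counts = Counter(traces)
    if not counts:
        return []
    variants = sorted(counts)  # distinct traces, in a deterministic order
    parent = list(range(len(variants)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for w, u, v in minimum_spanning_tree(variants, edit_distance):
        if w <= max_edge:
            parent[find(u)] = find(v)

    clusters = {}
    for i, t in enumerate(variants):
        clusters.setdefault(find(i), []).append(t)
    # Cluster size is measured in trace instances, not distinct variants.
    largest = max(clusters.values(), key=lambda c: sum(counts[t] for t in c))
    keep = set(largest)
    return [t for t in traces if t in keep]
```

Calling `filter_noise(log)` on a list of traces (each a sequence of activity names) returns the traces of the largest cluster, which would then be passed to the discovery step. Measuring cluster size by trace instances rather than distinct variants is also an assumption of this sketch.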
Pages: 48756-48764
Number of pages: 9