Multi-task prediction method of business process based on BERT and Transfer Learning

Cited by: 21
Authors
Chen, Hang [1, 2]
Fang, Xianwen [1, 2]
Fang, Huan [1]
Affiliations
[1] Anhui Univ Sci & Technol, Sch Math & Big Data, Huainan, Peoples R China
[2] Anhui Prov Engn Lab Big Data Anal & Early Warning, Huainan, Peoples R China
Keywords
Predictive business process monitoring; Transfer Learning; Transformer; BERT; Masked Activity Model; NEURAL-NETWORKS; CLASSIFIERS
DOI
10.1016/j.knosys.2022.109603
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Predictive Business Process Monitoring (PBPM) is one of the essential tasks in Business Process Management (BPM). It aims to predict the future behavior of an ongoing case, such as its next activity or its outcome, using the completed cases of a process stored in the event log. Although various deep learning methods have been proposed for PBPM, none of them considers simultaneous application to multiple predictive tasks. This paper proposes a multi-task prediction method based on BERT (Bidirectional Encoder Representations from Transformers) and Transfer Learning. First, the method uses BERT to perform a self-supervised pre-training task, the Masked Activity Model (MAM), on a large number of unlabeled traces. MAM captures the bidirectional semantic information of the input traces through BERT's bidirectional Transformer structure and obtains the long-term dependencies between activities through the Transformer's Attention mechanism. This yields a universal representation model of the traces. Finally, two separate models are defined for the two prediction tasks, next activity and case outcome, and the pre-trained model is transferred to each of them and trained with a fine-tuning strategy. Experimental evaluation on eleven real-world event logs shows that the performance of the prediction tasks is affected by the masking tactics and masking probabilities used in MAM. The method performs well on both the next activity prediction task and the case outcome prediction task, and it can be applied to several different prediction tasks faster and with better performance than direct training. (C) 2022 Published by Elsevier B.V.
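As a rough illustration of the masking step behind a pre-training task such as MAM, the sketch below hides some activities in a trace and records which ones a model would have to recover. It is not the authors' implementation: the `[MASK]` token name, the helper `mask_trace`, uniform random masking, and the default probability of 0.15 (the rate commonly used for BERT's masked language model) are all assumptions made for this example. The abstract states only that different masking tactics and probabilities affect performance, not which ones were used.

```python
import random

MASK = "[MASK]"  # assumed placeholder token; the name is hypothetical


def mask_trace(trace, mask_prob=0.15, rng=None):
    """Hide activities in a trace at random (illustrative helper).

    Returns (masked_trace, labels). labels[i] holds the original activity
    where position i was masked, and None where it was left unchanged.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for activity in trace:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(activity)  # target the model must reconstruct
        else:
            masked.append(activity)
            labels.append(None)      # position would be ignored in the loss
    # Guarantee at least one masked position so every trace yields a target.
    if trace and all(label is None for label in labels):
        i = rng.randrange(len(trace))
        labels[i] = trace[i]
        masked[i] = MASK
    return masked, labels


if __name__ == "__main__":
    # Hypothetical trace of activity names from an event log.
    example = ["Register", "Check", "Approve", "Pay", "Archive"]
    print(mask_trace(example, mask_prob=0.3, rng=random.Random(0)))
```

A bidirectional encoder trained on such pairs sees the activities on both sides of each masked position, which is what lets it learn context from the whole trace rather than only from the prefix.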
Pages: 15