ProcessAtlas: A scalable and extensible platform for business process analytics

Cited by: 15
Authors
Beheshti, Amin [1,2]
Benatallah, Boualem [1]
Motahari-Nezhad, Hamid Reza [1,3]
Affiliations
[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
[2] Macquarie Univ, Dept Comp, Sydney, NSW, Australia
[3] IBM Almaden Res Ctr, San Jose, CA USA
Keywords
business processes; data-centric process services; process analytics; process data curation; provenance
DOI
10.1002/spe.2558
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In today's knowledge-, service-, and cloud-based economy, an overwhelming amount of business-related data are being generated at a fast rate daily from a wide range of sources. These data increasingly show all the typical properties of big data: wide physical distribution, diversity of formats, nonstandard data models, and independently managed and heterogeneous semantics. In this context, there is a need for new scalable and process-aware services for querying, exploration, and analysis of process data in the enterprise because (1) process data analysis services should be capable of processing and querying large amounts of data effectively and efficiently and, therefore, have to scale well with the infrastructure's scale and (2) the querying services need to enable users to express their data analysis and querying needs using process-aware abstractions rather than lower-level abstractions. In this paper, we introduce ProcessAtlas, an extensible large-scale process data querying and analysis platform for analyzing process data in the enterprise. The ProcessAtlas platform offers an extensible architecture by adopting a service-based model so that new analytical services can be plugged into the platform. In ProcessAtlas, we present a domain-specific model for representing process knowledge, i.e., process-level entities, abstractions, and the relationships among them, modeled as graphs. We provide services for discovering, extracting, and analyzing process data. We provide efficient mapping and execution of process-level queries into graph-level queries by using scalable process query services to deal with the growth in process data size and with the infrastructure's scale. We have implemented ProcessAtlas as a MapReduce-based prototype and report on experiments performed on both synthetic and real-world datasets.
Pages: 842-866
Page count: 25