Scientific Workflow Mining in Clouds

Cited by: 28
Authors
Song, Wei [1]
Chen, Fangfei [1]
Jacobsen, Hans-Arno [2]
Xia, Xiaoxu [1]
Ye, Chunyang [3]
Ma, Xiaoxing [4]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
[2] Univ Toronto, Middleware Syst Res Grp, Toronto, ON M5S 3G4, Canada
[3] Hainan Univ, Coll Informat Sci & Technol, Haikou 570228, Hainan, Peoples R China
[4] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada; National Natural Science Foundation of China
Keywords
Scientific workflow; inter-cloud; workflow mining; event log; direct precedence; process models; conformance checking; resource; history
DOI
10.1109/TPDS.2017.2696942
Chinese Library Classification (CLC) number
TP301 [Theory and Methods]
Discipline classification code
081202
Abstract
Computing clouds have become the platform of choice for the deployment and execution of scientific workflows. Due to the uncertainty and unpredictability of scientific exploration, the execution plan for a scientific workflow may vary from the definition. It is therefore of great significance to be able to discover actual workflows from execution histories (event logs) to reproduce experimental results and to establish provenance. However, most existing process mining techniques focus on discovering control flow-oriented business processes in a centralized environment, and thus, they are mostly inapplicable to the discovery of data flow-oriented, unstructured scientific workflows in distributed cloud environments. In this paper, we present Scientific Workflow Mining as a Service (SWMaaS) to support both intra-cloud and inter-cloud scientific workflow mining. The approach is implemented as a ProM plug-in and is evaluated on event logs derived from real-world scientific workflows. Through experimental results, we demonstrate the effectiveness and efficiency of our approach.
Pages: 2979-2992
Page count: 14