A Process Mining Approach for Discovering ETL Black Points

Cited by: 4
Authors
Belo, Orlando [1]
Dias, Nuno [1]
Ferreira, Carlos [1]
Pinto, Filipe [1]
Affiliations
[1] Univ Minho, ALGORITMI R&D Ctr, Dept Informat, Braga, Portugal
Source
RECENT ADVANCES IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 2 | 2017 / Vol. 570
Keywords
Data warehousing systems; ETL processes; Process mining; ETL efficiency and optimization; ETL black points;
DOI
10.1007/978-3-319-56538-5_43
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
ETL tasks are complex and often give rise to an intricate network of working processes. Many of the difficulties in developing them stem from the number of information sources involved, the heterogeneity and dispersion of the data, and the complexity of the tasks that must be implemented to populate a data warehouse properly. As a result, undesirable situations can easily arise from ETL system design errors or from faulty or inefficient task implementations, and many of them are detectable only at run time. In this paper, we discuss ETL bottleneck situations (ETL black points) that can occur during the execution of an ETL system, and we use process mining to identify and characterize them. Analysis of the process mining results makes it possible to develop alternative implementations for inefficient tasks and to improve overall system performance.
Pages: 426-435
Page count: 10