Refreshing data warehouses with near real-time updates

Authors
Rahman, Nayem [1]
Affiliations
[1] Intel Corp, Business Intelligence Serv, Aloha, OR 97002 USA
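Keywords
data warehouse; near real-time; real-time; observation timestamp; metadata; incremental updates
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract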
In traditional decision support systems, data warehouses have been used to analyze historical information. In the past, it was relatively easy to confine data acquisition and maintenance to nightly batch windows, after business users had gone home. Now, however, decision makers need up-to-date information to make strategic business decisions, which requires data warehouses to be refreshed several times a day. This paper presents a technical outline for a near real-time decision support system in which data warehouses are refreshed using a metadata model and incremental refreshes, so that batch cycles can run more frequently. We propose a staging area in the data warehouse to capture data updates from external sources. Based on new data in the staging tables, we propose loading the analytical tables in the data warehouse using the database system as the transformation engine. We also propose making the database transformation tasks, such as stored procedure execution, metadata driven. The metadata model lets stored procedures in different business and analytical subject areas run only when data changes in their source subject area tables, and then performs a delta refresh of only those tables for which new data has arrived from the operational databases. Skipping unnecessary loads through this metadata-driven approach allows for faster cycle refreshes. Cycle refresh time statistics captured from a production data warehouse demonstrate the reductions in cycle time achieved by this batch technique.
Pages: 71-80
Page count: 10