A Comparative Analysis of Data Warehouse Data Models

Cited by: 0

Authors:

Bojicic, Ivan [1]

Marjanovic, Zoran [1]

Turajlic, Nina [1]

Petrovic, Marko [1]

Vuckovic, Milica [1]

Jovanovic, Vladan [2]

Affiliations:

[1] Univ Belgrade, Fac Org Sci, Belgrade, Serbia

[2] Georgia Southern Univ, Allen E Paulson Coll Engn & Informat Technol, Statesboro, GA 30460 USA

Source:

2016 6TH INTERNATIONAL CONFERENCE ON COMPUTERS COMMUNICATIONS AND CONTROL (ICCCC) | 2016

Keywords:

data warehouse; data models; relational/normalized model; data vault model; anchor model; dimensional model; evolving data

DOI:

Not available

Chinese Library Classification (CLC):

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The main purpose of data warehouses (DW) is to maintain large volumes of historical data (originating from multiple heterogeneous data sources and representing the different states of a system caused by various business events or activities) in a format that facilitates analysis, in order to support timelier and better decision-making at both the operational and strategic levels. For a data warehouse to adequately fulfill this purpose, its data model must enable the appropriate and consistent representation of the different states of a system. In effect, a DW data model, representing the physical structure of the DW, must be general enough to consume data from heterogeneous data sources (where all of the data should be treated as relevant and must be traceable back to its source) and to reconcile the semantic differences of the data source models, while at the same time being resilient to the constant changes in the structure of the data sources. One of the main problems related to DW development is the absence of a standardized DW data model. This paper gives a comparative analysis of the four most prominent DW data models (namely the relational/normalized model, data vault model, anchor model and dimensional model). These models are analyzed and compared on the basis of the following criteria: (1) semantics (i.e. the fundamental concepts), (2) resilience of the data model with regard to changes in the structure of the data sources, (3) temporal aspects and (4) completeness and traceability of the data. Identifying the strengths and weaknesses of each of these models should make it possible to establish the foundation for a new DW data model that more adequately fulfills the posed requirements.

Citation

Pages: 151-159

Page count: 9
