Why is the snowflake schema a good data warehouse design?

Cited by: 64
Authors
Levene, M [1]
Loizou, G [1]
Affiliations
[1] Univ London Birkbeck Coll, Sch Comp Sci & Informat Syst, London WC1E 7HX, England
Keywords
data warehouse design; star and snowflake schema; independent and separable database schema; acyclic database schema
DOI
10.1016/S0306-4379(02)00021-2
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model which is composed of a central fact table and a set of constituent dimension tables which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures its intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero. (C) 2002 Elsevier Science Ltd. All rights reserved.
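The structure described in the abstract can be illustrated with a minimal sketch, assuming a toy sales warehouse. The table names (`sales`, `product`, `category`, `store`), the columns, and the sample rows are hypothetical and do not come from the paper; the sketch shows only the informal shape of a snowflake schema (a fact table, dimension tables, one subdimension table, and foreign keys enforcing referential integrity), not the formal normal form the authors define.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory snowflake schema (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Subdimension table: product is broken up further into category.
        CREATE TABLE category (
            category_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        -- Dimension tables.
        CREATE TABLE product (
            product_id  INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            category_id INTEGER NOT NULL REFERENCES category(category_id)
        );
        CREATE TABLE store (
            store_id INTEGER PRIMARY KEY,
            city     TEXT NOT NULL
        );
        -- Central fact table; its primary key combines the dimension keys.
        CREATE TABLE sales (
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            store_id   INTEGER NOT NULL REFERENCES store(store_id),
            day        TEXT NOT NULL,
            amount     REAL NOT NULL,
            PRIMARY KEY (product_id, store_id, day)
        );
        """
    )
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "Beverages"), (2, "Snacks")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(10, "Tea", 1), (11, "Coffee", 1), (12, "Crisps", 2)])
    conn.execute("INSERT INTO store VALUES (?, ?)", (100, "London"))
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)",
                     [(10, 100, "2002-01-01", 5.0),
                      (11, 100, "2002-01-01", 7.5),
                      (12, 100, "2002-01-01", 2.0)])
    conn.commit()
    return conn


def revenue_by_category(conn: sqlite3.Connection) -> list:
    """Query by joining the fact table with its dimension and subdimension."""
    rows = conn.execute(
        """
        SELECT c.name, SUM(s.amount)
        FROM sales AS s
        JOIN product  AS p ON p.product_id  = s.product_id
        JOIN category AS c ON c.category_id = p.category_id
        GROUP BY c.name
        ORDER BY c.name
        """
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(revenue_by_category(build_warehouse()))
    # prints [('Beverages', 12.5), ('Snacks', 2.0)]
```

In this sketch each table can be updated on its own, and the foreign-key constraints reject a fact row that refers to a missing dimension row. This loosely mirrors the abstract's remark that relations can be updated independently as long as referential integrity is maintained.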
Pages: 225-240
Page count: 16