Warehousing complex data from the web

Cited by: 9
Authors
Boussaïd, O. [1, 2, 3, 4]
Darmont, J. [1, 5]
Bentayeb, F. [1, 4, 5]
Loudcher, S. [1, 4, 6, 7]
Affiliations
[1] ERIC, University of Lyon 2, 69676 Bron Cedex
[2] School of Economics and Management
[3] Department of Statistics and Computer Science
Keywords
Complex data; Complex data ETL process; Complex data warehouse; Data warehousing; Data web; Decision Support System; DSS; OLAP; Online Analytical Processing; X-warehousing; XML cube; XML warehousing
DOI
10.1504/IJWET.2008.019942
Abstract
Data warehousing and Online Analytical Processing (OLAP) technologies are now moving on to handling complex data, most of which originate from the web. However, integrating such data into a decision-support process requires representing them in a form that OLAP and/or data mining techniques can process. In this paper, we present a complex data warehousing methodology that exploits eXtensible Markup Language (XML) as a pivot language. Our approach includes the integration of complex data in an ODS, in the form of XML documents; their dimensional modelling and storage in an XML data warehouse; and their analysis with combined OLAP and data mining techniques. We also address the crucial issue of performance in XML warehouses. Copyright © 2008 Inderscience Enterprises Ltd.
Pages: 408-433
Number of pages: 25