DLToDW: Transferring Relational and NoSQL Databases from a Data Lake

Cited by: 0
Authors
Jemmali R. [1, 2]
Abdelhedi F. [1]
Zurfluh G. [2]
Affiliations
[1] CBI2, Trimane, Paris
[2] IRIT CNRS (UMR 5505), Toulouse University, Toulouse
Keywords
Big Data; Data Lake; Data Warehouse; MDA; NoSQL; QVT; Relational databases
DOI
10.1007/s42979-022-01287-7
Abstract
Over the past decade, digital transformation has driven the evolution of databases towards Big Data, and a need has emerged to collect and analyze data from different sources. At the same time, traditional decision support systems cannot meet the growing need of modern businesses to integrate and analyze the wide variety of data they generate. As a result, most organizations need to convert their data stored in relational systems to NoSQL ("Not only SQL") systems, which are based on flexible models and schemas. Our work is part of a medical application that must allow health professionals to analyze complex data for decision making. We propose mechanisms to extract data from a Data Lake and store them in a NoSQL Data Warehouse. This makes it possible to perform, in a second step, decision analysis facilitated by the features of NoSQL systems (rich data structures, query language, access performance). In this article, we present a process for ingesting data from a Data Lake into a Data Warehouse. The ingestion consists of, first, transferring relational and NoSQL databases extracted from the Data Lake into a single NoSQL database (the Data Warehouse); second, merging so-called "similar" classes; and third, converting the links into references between objects. To automate this process, we used the Model Driven Architecture (MDA), which provides a schema transformation environment. From the physical schemas describing a Data Lake, we propose transformation rules that create a Data Warehouse stored in a document-oriented NoSQL system. An experiment was carried out for a medical application. © 2022, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
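The three ingestion steps named in the abstract (transfer, merge of "similar" classes, conversion of links into references) can be illustrated with a small in-memory sketch. This is not the authors' MDA/QVT implementation: the sample data, the function names, the rule that two collections are "similar" when they share a key field, and the shape of the reference object are all assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical Data Lake content: two sources describing patients, one describing doctors.
SOURCES = {
    "rel_patients": [  # rows from an assumed relational table
        {"patient_id": 1, "name": "Ann", "doctor_id": 10},
        {"patient_id": 2, "name": "Bob", "doctor_id": 10},
    ],
    "doc_patients": [  # documents from an assumed NoSQL collection
        {"patient_id": 1, "allergies": ["penicillin"]},
    ],
    "doctors": [{"doctor_id": 10, "name": "Dr. Lee"}],
}


def transfer(sources):
    """Step 1: copy every source into a single document store (a dict of lists)."""
    return {name: [dict(row) for row in rows] for name, rows in sources.items()}


def merge_similar(warehouse, names, key, target):
    """Step 2: merge collections assumed to be 'similar' by joining them on a shared key."""
    merged = defaultdict(dict)
    for name in names:
        for doc in warehouse.pop(name):
            merged[doc[key]].update(doc)
    warehouse[target] = list(merged.values())
    return warehouse


def links_to_references(warehouse, source, fk, target, target_key, ref_field):
    """Step 3: replace a foreign-key value with a reference object to the target document."""
    known_ids = {doc[target_key] for doc in warehouse[target]}
    for doc in warehouse[source]:
        value = doc.pop(fk, None)
        if value in known_ids:
            doc[ref_field] = {"collection": target, "id": value}
    return warehouse


def run_example():
    """Run the three steps in order on the hypothetical sources."""
    warehouse = transfer(SOURCES)
    merge_similar(warehouse, ["rel_patients", "doc_patients"], "patient_id", "patients")
    links_to_references(warehouse, "patients", "doctor_id", "doctors", "doctor_id", "doctor")
    return warehouse


if __name__ == "__main__":
    for collection, docs in run_example().items():
        print(collection, docs)
```

In the paper the corresponding steps are driven by transformation rules over physical schemas rather than by hand-written functions; the sketch only shows the kind of result each step is meant to produce.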
Related papers
13 in total
  • [1] Couto J., Borges O., Ruiz D.D., Marczak S., Prikladnicki R., A mapping study about Data Lakes: an improved definition and possible architectures, SEKE, (2019)
  • [2] Kuszera E.M., Peres L.M., Fabro M.D.D., Toward RDB to NoSQL: Transforming data with metamorfose framework, Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 456-463, (2019)
  • [3] Mahmood A.A., Automated algorithm for data migration from relational to NoSQL databases, Al-Nahrain J Eng Sci, 21, pp. 60-65, (2018)
  • [4] Stanescu L., Brezovan M., Burdescu D.D., Automatic mapping of MySQL databases to NoSQL MongoDB, pp. 837-840, (2016)
  • [5] Liyanaarachchi G., Kasun L., Nimesha M., Lahiru K., Karunasena A., MigDB - relational to NoSQL mapper, 2016 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), pp. 1-6, (2016)
  • [6] Mallek H., Ghozzi F., Teste O., Gargouri F., BigDimETL with NoSQL Database, Procedia Comput Sci, 126, pp. 798-807, (2018)
  • [7] Yangui R., Nabli A., Gargouri F., ETL based framework for NoSQL warehousing, Information systems, pp. 40-53, (2017)
  • [8] Wijaya Y.S., Arman A.A., A framework for data migration between different datastore of NoSQL Database, 2018 International Conference on ICT for Smart Society (ICISS), pp. 1-6, (2018)
  • [9] Dabbechi H., Haddar N., Elghazel H., Haddar K., Social media data integration: From Data Lake to NoSQL Data Warehouse, pp. 701-710, (2021)
  • [10] Candel C.J.F., Ruiz D.S., Garcia-Molina J.J., A unified metamodel for NoSQL and relational databases, (2021)