Logical big data integration and near real-time data analytics

Cited by: 6
Authors
Silva, Bruno [1]
Moreira, Jose [1, 2]
Costa, Rogerio Luis de C. [3]
Affiliations
[1] Univ Aveiro, Inst Elect & Informat Engn IEETA, LASI, P-3810193 Aveiro, Portugal
[2] Univ Aveiro, Dept Elect Telecommun & Informat DETI, P-3810193 Aveiro, Portugal
[3] Polytech Leiria, Comp Sci & Commun Res Ctr CIIC, P-2411901 Leiria, Portugal
Keywords
Big data integration; Distributed databases; Near real-time OLAP
DOI
10.1016/j.datak.2023.102185
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In the context of decision-making, there is a growing demand for near real-time data that traditional solutions, like data warehousing based on long-running ETL processes, cannot fully meet. On the other hand, existing logical data integration solutions are challenging to use because users must focus on data location and distribution details rather than on data analytics and decision-making. EasyBDI is an open-source system that provides logical integration of data and high-level business-oriented abstractions. It uses schema matching, integration, and mapping techniques to automatically identify partitioned data and propose a global schema. Users can then specify star schemas based on global entities and submit analytical queries to retrieve data from distributed data sources without knowing the organization and other technical details of the underlying systems. This work presents the algorithms and methods for global schema creation and query execution. Experimental results show that the overhead imposed by logical integration layers is relatively small compared to the execution times of distributed queries.
Pages: 20
Related papers
50 entries in total
  • [31] Towards Real-Time Road Traffic Analytics using Telco Big Data
    Costa, Constantinos
    Chatzimilioudis, Georgios
    Zeinalipour-Yazti, Demetrios
    Mokbel, Mohamed F.
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL WORKSHOP ON REAL-TIME BUSINESS INTELLIGENCE AND ANALYTICS, 2017,
  • [32] The growing role of integrated and insightful big and real-time data analytics platforms
    Ranganathan, Indrakumari
    Thangamuthu, Poongodi
    Palanimuthu, Suresh
    Balusamy, Balamurugan
    DIGITAL TWIN PARADIGM FOR SMARTER SYSTEMS AND ENVIRONMENTS: THE INDUSTRY USE CASES, 2020, 117 : 165 - 186
  • [33] Real-Time or Near Real-Time Persisting Daily Healthcare Data Into HDFS and ElasticSearch Index Inside a Big Data Platform
    Chen, Dequan
    Chen, Yi
    Brownlow, Brian N.
    Kanjamala, Pradip P.
    Arredondo, Carlos A. Garcia
    Radspinner, Bryan L.
    Raveling, Matthew A.
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2017, 13 (02) : 595 - 606
  • [34] A methodology for real-time data sustainability in smart city: Towards inferencing and analytics for big-data
    Malik, Kaleem Razzaq
    Sam, Yacine
    Hussain, Majid
    Abuarqoub, Abdelrahman
    SUSTAINABLE CITIES AND SOCIETY, 2018, 39 : 548 - 556
  • [35] A survey on data stream, big data and real-time
    Gomes E.H.A.
    Plentz P.D.M.
    De Rolt C.R.
    Dantas M.A.R.
    International Journal of Networking and Virtual Organisations, 2019, 20 (02) : 143 - 167
  • [36] Near real-time analysis of big fusion data on HPC systems
    Kube, Ralph
    Churchill, R. Michael
    Choi, Jong
    Wang, Ruonan
    Choi, Minjun
    Klasky, Scott
    Chang, C. S.
    PROCEEDINGS OF URGENTHPC 2020: THE IEEE/ACM INTERNATIONAL WORKSHOPS ON URGENT AND INTERACTIVE HPC, 2020, : 55 - 63
  • [37] REAL-TIME BIG DATA ANALYTICS FRAMEWORK WITH DATA BLENDING APPROACH FOR MULTIPLE DATA SOURCES IN SMART CITY APPLICATIONS
    Manjunatha, S.
    Annappa, B.
    SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2020, 21 (04) : 611 - 623
  • [38] Real-Time Data Analytics: An Algorithmic Perspective
    Morshed, Sarwar Jahan
    Rana, Juwel
    Milrad, Marcelo
    DATA MINING AND BIG DATA, DMBD 2016, 2016, 9714 : 311 - 320
  • [39] Real-Time Clickstream Data Analytics and Visualization
    Hanamanthrao, Ramanna
    Thejaswini, S.
    2017 2ND IEEE INTERNATIONAL CONFERENCE ON RECENT TRENDS IN ELECTRONICS, INFORMATION & COMMUNICATION TECHNOLOGY (RTEICT), 2017, : 2139 - 2144
  • [40] A Streamlined Approach for Real-Time Data Analytics
    Arora, Shruti
    Rani, Rinkle
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INVENTIVE COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICICCT), 2018, : 732 - 736