Quarry: A User-centered Big Data Integration Platform

Authors
Petar Jovanovic
Sergi Nadal
Oscar Romero
Alberto Abelló
Besim Bilalli
Affiliation
[1] Universitat Politècnica de Catalunya (BarcelonaTech)
Source
Information Systems Frontiers | 2021 / Volume 23
Keywords
Data Integration; Big Data; Data-Intensive Flows; Metadata
Abstract
Obtaining valuable insights and actionable knowledge from data requires cross-analysis of domain data that typically come from various sources. Doing so inevitably imposes burdensome processes of unifying different data formats and discovering integration paths, all driven by the specific analytical needs of a data analyst. Along with large volumes of data, the variety of formats, data models, and semantics drastically contributes to the complexity of such processes. Although there have been many attempts to automate various processes along the Big Data pipeline, no unified platform accessible to users without technical skills (such as statisticians or business analysts) has been proposed. In this paper, we present a Big Data integration platform (Quarry) that uses hypergraph-based metadata to facilitate (and largely automate) the integration of domain data coming from a variety of sources, and provides an intuitive interface to assist end users in both: (1) data exploration, with the goal of discovering potentially relevant analysis facets, and (2) consolidation and deployment of data flows that integrate the data and prepare them for further analysis (descriptive or predictive), visualization, and/or publishing. We validate Quarry’s functionalities with the use case of World Health Organization (WHO) epidemiologists and data analysts in their fight against Neglected Tropical Diseases (NTDs).
Pages: 9 - 33
Page count: 24