Quarry: A User-centered Big Data Integration Platform

Authors
Petar Jovanovic
Sergi Nadal
Oscar Romero
Alberto Abelló
Besim Bilalli
Affiliation
[1] Universitat Politècnica de Catalunya (BarcelonaTech)
Source
Information Systems Frontiers | 2021 / Vol. 23
Keywords
Data Integration; Big Data; Data-Intensive Flows; Metadata
DOI
Not available
Abstract
Obtaining valuable insights and actionable knowledge from data requires cross-analysis of domain data that typically come from various sources. Doing so inevitably imposes burdensome processes of unifying different data formats and discovering integration paths, all driven by the specific analytical needs of a data analyst. Along with large volumes of data, the variety of formats, data models, and semantics drastically contributes to the complexity of such processes. Although there have been many attempts to automate various processes along the Big Data pipeline, no unified platform accessible to users without technical skills (such as statisticians or business analysts) has been proposed. In this paper, we present a Big Data integration platform (Quarry) that uses hypergraph-based metadata to facilitate (and largely automate) the integration of domain data coming from a variety of sources, and provides an intuitive interface to assist end users in both: (1) data exploration, with the goal of discovering potentially relevant analysis facets, and (2) consolidation and deployment of data flows that integrate the data and prepare them for further analysis (descriptive or predictive), visualization, and/or publishing. We validate Quarry’s functionalities with the use case of World Health Organization (WHO) epidemiologists and data analysts in their fight against Neglected Tropical Diseases (NTDs).
Pages: 9 - 33
Page count: 24