An Approach to Extracting Topic-guided Views from the Sources of a Data Lake

Cited by: 12
Authors
Diamantini, Claudia [1]
Lo Giudice, Paolo [2]
Potena, Domenico [1]
Storti, Emanuele [1]
Ursino, Domenico [1]
Affiliations
[1] Polytech Univ Marche, DII, Ancona, Italy
[2] Univ Mediterranea Reggio Calabria, DIIES, Reggio Di Calabria, Italy
Keywords
Data lakes; Unstructured data sources; Metadata management; Thematic views; Semantic similarities; DBpedia; LINKED DATA; INFORMATION; INTEGRATION; QUERIES; CONSTRUCTION; SYSTEM; DIKE
DOI
10.1007/s10796-020-10010-x
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In recent years, data lakes have emerged as effective and efficient support for extracting information and knowledge from a huge number of highly heterogeneous and quickly changing data sources. Data lake management requires new techniques, very different from those adopted for data warehouses in the past. In this scenario, one of the most challenging issues is the extraction of topic-guided (i.e., thematic) views from the (very heterogeneous and often unstructured) sources of a data lake. In this paper, we propose a new network-based model to uniformly represent the structured, semi-structured and unstructured sources of a data lake. Then, we present a new approach to, at least partially, "structuring" unstructured data. Finally, we define a technique to extract topic-guided views from the sources of a data lake, based on similarity and other semantic relationships among source metadata.
Pages: 243-262
Page count: 20