Towards a Semantic Extract-Transform-Load (ETL) framework for Big Data Integration

Cited by: 56
Author
Bansal, Srividya K. [1]
Affiliation
[1] Arizona State Univ, Dept Engn & Comp Syst, Mesa, AZ 85212 USA
Source
2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS) | 2014
Keywords
Big data; Data integration; Ontology; Semantic technologies; DESIGN;
DOI
10.1109/BigData.Congress.2014.82
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Big Data has become a ubiquitous term for massive collections of datasets that are difficult to process using traditional database and software techniques. Most of this data is inaccessible to users, because technology and tools are needed to find, transform, analyze, and visualize it in order to make it consumable for decision-making. One aspect of Big Data research deals with the Variety of data, which includes formats such as structured, numeric, unstructured text data, email, video, audio, stock ticker, etc. Managing, merging, and governing a variety of data is the focus of this paper. This paper proposes a semantic Extract-Transform-Load (ETL) framework that uses semantic technologies to integrate and publish data from multiple sources as open linked data. This includes: creation of a semantic data model to provide a basis for integration and understanding of knowledge from multiple sources; creation of a distributed Web of data using Resource Description Framework (RDF) as the graph data model; and extraction of useful knowledge and information from the combined data using SPARQL as the semantic query language.
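The extract, transform, and query stages named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes plain Python tuples standing in for RDF triples and a simple wildcard pattern match standing in for a SPARQL query, so that it runs with the standard library alone. The `http://example.org/` namespace, the sample records, and the function names `extract_csv`, `transform`, and `match` are all hypothetical.

```python
import csv
import io

# Hypothetical namespace and sample data, invented for illustration only.
EX = "http://example.org/"

CSV_SOURCE = "id,name,city\n1,Alice,Mesa\n2,Bob,Tempe\n"
DICT_SOURCE = [{"person": "1", "employer": "ASU"}]


def extract_csv(text):
    """Extract: read rows from a CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, records):
    """Transform: map both sources onto one shared vocabulary of triples."""
    triples = set()
    for row in rows:
        subject = EX + "person/" + row["id"]
        triples.add((subject, EX + "name", row["name"]))
        triples.add((subject, EX + "city", row["city"]))
    for rec in records:
        subject = EX + "person/" + rec["person"]
        triples.add((subject, EX + "employer", rec["employer"]))
    return triples


def match(triples, pattern):
    """Query: return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Load: the merged set of triples plays the role of the integrated graph.
graph = transform(extract_csv(CSV_SOURCE), DICT_SOURCE)

# Everything known about person 1, regardless of which source supplied it.
alice = match(graph, (EX + "person/1", None, None))
```

Because both sources are mapped to the same subject identifier, a single query returns facts that originated in different sources. In a real pipeline the triples would be stored with an RDF library and the pattern match would be written in SPARQL.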
Pages: 521-528
Number of pages: 8