Towards a Semantic Extract-Transform-Load (ETL) framework for Big Data Integration

Cited by: 56
Author
Bansal, Srividya K. [1]
Affiliation
[1] Arizona State Univ, Dept Engn & Comp Syst, Mesa, AZ 85212 USA
Source
2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS) | 2014
Keywords
Big data; Data integration; Ontology; Semantic technologies; DESIGN;
DOI
10.1109/BigData.Congress.2014.82
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Big Data has become the ubiquitous term for massive collections of datasets that are difficult to process using traditional database and software techniques. Most of this data is inaccessible to users, because technology and tools are needed to find, transform, analyze, and visualize it before it becomes consumable for decision-making. One aspect of Big Data research deals with the Variety of data, which spans formats such as structured and numeric data, unstructured text, email, video, audio, and stock ticker data. Managing, merging, and governing this variety of data is the focus of this paper. The paper proposes a semantic Extract-Transform-Load (ETL) framework that uses semantic technologies to integrate and publish data from multiple sources as open linked data. This includes: creating a semantic data model that provides a basis for integrating and understanding knowledge from multiple sources; creating a distributed Web of data using the Resource Description Framework (RDF) as the graph data model; and extracting useful knowledge and information from the combined data using SPARQL as the semantic query language.
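The extract, transform, and load steps named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: a plain Python set of (subject, predicate, object) tuples stands in for an RDF graph, and a wildcard pattern matcher stands in for a SPARQL query. The namespace http://example.org/, the property names, the function names, and the two small sources are all hypothetical, invented for illustration.

```python
import csv
import io

# Hypothetical namespace, invented for this sketch.
EX = "http://example.org/"


def extract(csv_text):
    """Extract: parse the rows of one CSV source into dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows, id_field, mapping):
    """Transform: turn each row into (subject, predicate, object) triples.

    `mapping` plays the role of the shared semantic model: it maps a
    source-specific column name to a common property name.
    """
    triples = set()
    for row in rows:
        subject = EX + "item/" + row[id_field].strip().lower()
        for column, prop in mapping.items():
            value = row.get(column)
            if value:
                triples.add((subject, EX + prop, value.strip()))
    return triples


def match(graph, s=None, p=None, o=None):
    """Query: return sorted triples matching a pattern (None = wildcard).

    A stand-in for a single SPARQL triple pattern, not SPARQL itself.
    """
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two made-up sources that describe the same items with different columns.
source_a = "id,label\nA1,Pump\nA2,Valve\n"
source_b = "item,location\nA1,Plant North\nA2,Plant South\n"

# Load: merge the triples from both sources into one graph.
graph = set()
graph |= transform(extract(source_a), "id", {"label": "label"})
graph |= transform(extract(source_b), "item", {"location": "location"})

# Facts about item A1 now come from both sources.
a1_facts = match(graph, s=EX + "item/a1")
```

Because both sources are mapped to the same subject identifiers, a single query over the merged graph returns facts that originally lived in separate files, which is the integration effect the abstract describes.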
Pages: 521 - 528
Page count: 8
Related papers
50 in total
  • [41] Towards Big Data Security Framework by Leveraging Fragmentation and Blockchain Technology
    Alhazmi, Hanan E.
    Eassa, Fathy E.
    Sandokji, Suhelah M.
    IEEE ACCESS, 2022, 10 : 10768 - 10782
  • [42] Towards an Intelligent Framework for Scientific Computational Steering in Big Data Systems
    Zhang, Yijie
    Wu, Chase Q.
    2024 IEEE 24TH INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING, CCGRID 2024, 2024, : 671 - 675
  • [43] Towards an IoT Big Data Analytics Framework: Smart Buildings Systems
    Bashir, Muhammad Rizwan
    Gill, Asif Qumer
    PROCEEDINGS OF 2016 IEEE 18TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS; IEEE 14TH INTERNATIONAL CONFERENCE ON SMART CITY; IEEE 2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2016, : 1325 - 1332
  • [44] Ocean knowledge representation through integration of big data employing semantic web technologies
    Anitha Velu
    Menakadevi Thangavelu
    Earth Science Informatics, 2022, 15 : 1563 - 1585
  • [45] Towards a new scalable big data system semantic web applied on mobile learning
    Banane M.
    Belangour A.
    International Journal of Interactive Mobile Technologies, 2020, 14 (01) : 126 - 140
  • [46] A Framework for Enhancing Big Data Integration in Biological Domain Using Distributed Processing
    Almasoud, Ameera
    Al-Khalifa, Hend
    Al-salman, AbdulMalik
    Lytras, Miltiadis
    APPLIED SCIENCES-BASEL, 2020, 10 (20) : 1 - 16
  • [47] An Integration Framework on Cloud for Cyber-Physical-Social Systems Big Data
    Kuang, Liwei
    Yang, Laurence T.
    Liao, Yang
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2020, 8 (02) : 363 - 374
  • [48] Towards the Integration of Agricultural Data from Heterogeneous Sources: Perspectives for the French Agricultural Context Using Semantic Technologies
    Jiang, Shufan
    Angarita, Rafael
    Chiky, Raja
    Cormier, Stephane
    Rousseaux, Francis
    ADVANCED INFORMATION SYSTEMS ENGINEERING WORKSHOPS, 2020, 382 : 89 - 94
  • [49] Research on Load Curve Clustering of Distribution Transformer Based on Wavelet Transform and Big Data Processing
    Yuan, Shaoguang
    Zhang, Xiaofei
    Geng, Juncheng
    Wan, Diming
    PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017, : 348 - 351
  • [50] An ontology-guided semantic data integration framework to support integrative data analysis of cancer survival
    Zhang, Hansi
    Guo, Yi
    Li, Qian
    George, Thomas J.
    Shenkman, Elizabeth
    Modave, Francois
    Bian, Jiang
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2018, 18