Metadata management in a big data infrastructure

Cited by: 2
Authors
Holom, Roxana-Maria [1]
Rafetseder, Katharina [1]
Kritzinger, Stefanie [1]
Sehrschoen, Herald [2]
Affiliations
[1] RISC Software GmbH, Softwarepk 35, A-4232 Hagenberg, Austria
[2] Fill Gesell mbH, Fillstr 1, A-4232 Gurten, Austria
Source
INTERNATIONAL CONFERENCE ON INDUSTRY 4.0 AND SMART MANUFACTURING (ISM 2019) | 2020 / Vol. 42
Funding
EU Horizon 2020;
Keywords
big data; metadata; data harmonization; linked data; semantics; machine learning;
DOI
10.1016/j.promfg.2020.02.060
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
The adoption of the Internet of Things (IoT) in industry provides the opportunity to gather valuable data. Nevertheless, this large amount of data must be analyzed to identify patterns, model the behavior of equipment and enable prediction. Although big data emerged some years ago, many challenges remain to be solved; for example, metadata representation and management are still a research topic. The big data architecture of the RISC data analytics framework combines big data technologies with semantic approaches to process and store large volumes of data from heterogeneous sources, provided by FILL, a key machine tool provider. The proposed architecture is capable of handling sensor data using big data technologies such as Spark on Hadoop, InfluxDB and Elasticsearch. The metadata representation and management approach is adopted in order to define the structure of, and the relations (i.e., the connections) between, the various data sources provided by the sensors and the logging information system. In addition, using a metadata approach in our big data environment enhances the RISC data analytics framework by making it generic, reusable and responsive to changes, thus keeping the data lakes up to date and ensuring the validity of the analytics results. The work presented here is part of an ongoing project (BOOST 4.0) currently addressed under the EU H2020 program. © 2020 The Authors. Published by Elsevier B.V.
Pages: 375 - 382
Page count: 8
Related papers
33 items in total
  • [1] Amstutz P., SEMANTIC ANNOTATIONS
  • [2] [Anonymous], AP HAD
  • [3] [Anonymous], SPARQL QUERY LANGUAG
  • [4] Apache Software Foundation, Apache Kafka.
  • [5] Apache Software Foundation, Apache Spark
  • [6] BOOST4.0ProjectPartners, BOOST 4 0 BIG DAT FA
  • [7] CambridgeSemantics, SEM LAYER HAD
  • [8] Big Data Semantics
    Ceravolo, Paolo
    Azzini, Antonia
    Angelini, Marco
    Catarci, Tiziana
    Cudre-Mauroux, Philippe
    Damiani, Ernesto
    Mazak, Alexandra
    Van Keulen, Maurice
    Jarrar, Mustafa
    Santucci, Giuseppe
    Sattler, Kai-Uwe
    Scannapieco, Monica
    Wimmer, Manuel
    Wrembel, Robert
    Zaraket, Fadi
    [J]. JOURNAL ON DATA SEMANTICS, 2018, 7 (02) : 65 - 85
  • [9] SciData: a data model and ontology for semantic representation of scientific data
    Chalk, Stuart J.
    [J]. JOURNAL OF CHEMINFORMATICS, 2016, 8
  • [10] What are ontologies, and why do we need them?
    Chandrasekaran, B
    Josephson, JR
    Benjamins, VR
    [J]. IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS, 1999, 14 (01) : 20 - 26