An Automated Metadata Generation Method for Data Lake of Industrial WoT Applications

Cited by: 5
Authors
Yu, Han [1]
Cai, Hongming [1]
Liu, Zhiyuan [1]
Xu, Boyi [2]
Jiang, Lihong [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Software, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Coll Econ & Management, Shanghai 200052, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2022 / Vol. 52 / No. 8
Funding
National Natural Science Foundation of China
Keywords
Metadata; Semantics; Runtime; Data mining; Ontologies; Text recognition; Conferences; Data lake (DL); data modeling; entity recognition; metadata generation; stream processing; Web of Things (WoT); ACQUISITION; EXTRACTION;
DOI
10.1109/TSMC.2021.3119871
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent trends in the Web of Things (WoT) have led to a data explosion. The data lake (DL), a flexible, on-demand architecture for managing heterogeneous data, has become a feasible solution for data management. Metadata modeling for DLs is the key basis for smart analysis and processing. However, the variety in the structures and semantics of industrial WoT data hinders metadata modeling and maintenance. Moreover, the lack of textual descriptions and the semantics hidden in value streams make it hard to construct semantic metadata automatically. The dynamic nature of WoT also requires metadata to evolve in a timely manner. To overcome these challenges, we propose an automated bottom-up metadata generation approach for DLs of WoT applications. Within a data-driven framework, raw data are notated as linked data, and online clustering based on self-organizing maps is applied to extract data characteristics in real time. To recognize entities, concepts, and relations, a semantics-based approach to entity discovery from short texts is proposed, tailored to the features of WoT data. Numerical analysis is performed to find hidden relations in raw values. Full-dimensional metadata with rich semantic knowledge are finally built. Experiments on a real-world dataset verify the effectiveness of the methods, and a case study on an energy WoT system demonstrates the feasibility of the approach.
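The abstract names online clustering based on self-organizing maps but gives no algorithmic detail. As a rough illustration only, the following is a minimal generic online SOM update in Python with NumPy. It is not the authors' implementation: the class name `OnlineSOM`, the grid size, the learning rate `lr`, and the neighborhood width `sigma` are all assumptions chosen for the sketch.

```python
import numpy as np


class OnlineSOM:
    """Generic online self-organizing map (illustrative sketch, not the paper's method)."""

    def __init__(self, rows, cols, dim, lr=0.5, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # One weight vector per map unit, initialized uniformly in [0, 1).
        self.weights = rng.random((rows, cols, dim))
        # Grid coordinates of every unit, shape (rows, cols, 2).
        self.grid = np.stack(
            np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"),
            axis=-1,
        )
        self.lr = lr          # assumed constant learning rate
        self.sigma = sigma    # assumed constant neighborhood width

    def bmu(self, x):
        """Return the (row, col) of the best-matching unit for sample x."""
        dist = np.linalg.norm(self.weights - np.asarray(x, dtype=float), axis=-1)
        idx = np.unravel_index(np.argmin(dist), dist.shape)
        return tuple(int(i) for i in idx)

    def update(self, x):
        """Process one sample from the stream and return its best-matching unit."""
        x = np.asarray(x, dtype=float)
        win = self.bmu(x)
        # Gaussian neighborhood around the winner, measured on the map grid.
        grid_dist2 = np.sum((self.grid - np.array(win)) ** 2, axis=-1)
        h = np.exp(-grid_dist2 / (2.0 * self.sigma ** 2))
        # Pull every unit toward x, scaled by its neighborhood weight.
        self.weights += self.lr * h[..., None] * (x - self.weights)
        return win
```

Each incoming value vector is handled with a single `update` call, so the map adapts as the stream arrives without storing past samples. A practical system would typically decay `lr` and `sigma` over time; that is omitted here for brevity.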
Pages: 5235 - 5248
Page count: 14
Related papers
50 in total
  • [41] A novel method for detecting lake ice cover using optical satellite data
    Heinila, Kirsikka
    Mattila, Olli-Pekka
    Metsamaki, Sari
    Vakeva, Sakari
    Luojus, Kari
    Schwaizer, Gabriele
    Koponen, Sampsa
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2021, 104
  • [42] TwinXR: Method for using digital twin descriptions in industrial eXtended reality applications
    Tu, Xinyi
    Autiosalo, Juuso
    Ala-Laurinaho, Riku
    Yang, Chao
    Salminen, Pauli
    Tammi, Kari
    FRONTIERS IN VIRTUAL REALITY, 2023, 4
  • [43] A Hybrid Feature Selection Method for Effective Data Classification in Data Mining Applications
    Sangaiya, Ilangovan
    Kumar, A. Vincent Antony
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2019, 11 (01) : 1 - 16
  • [44] An Automatic Software Behavior Model Generation Method for Industrial Cyber-Physical System
    Sun, Weiqi
    Dai, Wenbin
    2020 IEEE 18TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), VOL 1, 2020, : 897 - 902
  • [45] A Data-mining Method to Assess Automatic Generation Control Performance of Power Generation Units
    Yang, Zijiang
    Wang, Jiandong
    Gao, Song
    Pang, Xiangkun
    PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021, : 159 - 164
  • [46] Fully automated generation of parametric BIM for MEP scenes based on terrestrial laser scanning data
    Wang, Boyu
    Yin, Chao
    Luo, Han
    Cheng, Jack C. P.
    Wang, Qian
    AUTOMATION IN CONSTRUCTION, 2021, 125
  • [47] Automated generation of photochemical reaction data by transient flow experiments coupled with online HPLC analysis
    Haas, Christian P.
    Biesenroth, Simon
    Buckenmaier, Stephan
    van de Goor, Tom
    Tallarek, Ulrich
    REACTION CHEMISTRY & ENGINEERING, 2020, 5 (05) : 912 - 920
  • [48] Automated Hashtag Hierarchy Generation Using Community Detection and the Shannon Diversity Index, with Applications to Twitter and Parler
    Torene, Spencer
    Follmann, Andrew
    Teague, Thomas
    Chang, Peter
    Howald, Blake
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2022, 16 (04) : 473 - 496
  • [49] Method of data anomaly detection in the process of mobile applications installation
    Polhul, Tetiana D.
    Yarovyi, Andrii A.
    Romaniuk, Ryszard
    Komada, Pawel
    Askarova, Nursanat
    PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2019, 2019, 11176
  • [50] Automated Data Model Generation From Textual Specifications: A Case Study of ECHONET Lite Specification
    Pham, Van Cu
    Linh, Nguyen Thi Dieu
    Le, Tung
    Nguyen, Tien Huy
    Tan, Yasuo
    IEEE ACCESS, 2023, 11 : 138316 - 138324