Ontology-based Framework for Integration of Time Series Data: Application in Predictive Analytics on Data Center Monitoring Metrics

Cited by: 0
Authors
Tuovinen, Lauri [1]
Suutala, Jaakko [1]
Affiliations
[1] University of Oulu, Biomimetics and Intelligent Systems Group, Oulu, Finland
Source
PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KEOD), VOL 2 | 2021
Keywords
Data Integration; Data Analytics; Time Series Data; Data Center; Domain Ontology; Software Framework;
DOI
10.5220/0010650300003064
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Monitoring a large and complex system such as a data center generates many time series of metric data, which are often stored using a database system specifically designed for managing time series data. Different, possibly distributed, databases may be used to collect data representing different aspects of the system, which complicates matters when, for example, developing data analytics applications that require integrating data from two or more of these. From the developer's point of view, it would be highly convenient if all of the required data were available in a single database, but it may well be that the different databases do not even implement the same query language. To address this problem, we propose using an ontology to capture the semantic similarities among different time series database systems and to hide their syntactic differences. Alongside the ontology, we have developed a Python software framework that enables the developer to build and execute queries using classes and properties defined by the ontology. The ontology thus effectively specifies a semantic query language that can be used to retrieve data from any of the supported database systems, and the Python framework can be set up to treat the different databases as a single data store that can be queried using this semantic language. This is demonstrated by presenting an application involving predictive analytics on resource usage and electricity consumption metrics gathered from a Kubernetes cluster, stored in Prometheus and KairosDB databases, but the framework can be extended in various ways and adapted to different use cases, enabling machine learning research using distributed heterogeneous data sources.
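The idea of one semantic query being rendered for several time series databases can be illustrated with a minimal sketch. This is not the authors' framework and does not use their ontology; the names `MetricQuery`, `CATALOG`, `to_prometheus`, `to_kairosdb`, and `render`, as well as the KairosDB metric name `node.power.watts`, are hypothetical and chosen only for illustration. The sketch assumes the standard Prometheus `query_range` HTTP parameters and the KairosDB `datapoints/query` JSON body, and it only builds the request payloads without contacting any server.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class MetricQuery:
    """Backend-neutral description of a time series query (hypothetical)."""
    metric: str                      # semantic metric name, e.g. "cpu_usage"
    start: int                       # range start, Unix seconds
    end: int                         # range end, Unix seconds
    labels: Dict[str, str] = field(default_factory=dict)


def to_prometheus(q: MetricQuery, step: str = "60s") -> dict:
    """Render as parameters for Prometheus GET /api/v1/query_range."""
    selector = ",".join(f'{k}="{v}"' for k, v in sorted(q.labels.items()))
    promql = f"{q.metric}{{{selector}}}" if selector else q.metric
    return {"query": promql, "start": q.start, "end": q.end, "step": step}


def to_kairosdb(q: MetricQuery) -> dict:
    """Render as the JSON body for KairosDB POST /api/v1/datapoints/query."""
    metric: dict = {"name": q.metric}
    if q.labels:
        # KairosDB expects each tag to map to a list of accepted values.
        metric["tags"] = {k: [v] for k, v in sorted(q.labels.items())}
    return {
        "start_absolute": q.start * 1000,   # KairosDB uses milliseconds
        "end_absolute": q.end * 1000,
        "metrics": [metric],
    }


# Assumed mapping from semantic names to (backend, native metric name).
# In the paper this role is played by the ontology; here it is a plain dict.
CATALOG = {
    "cpu_usage": ("prometheus", "container_cpu_usage_seconds_total"),
    "power": ("kairosdb", "node.power.watts"),  # hypothetical metric name
}

RENDERERS = {"prometheus": to_prometheus, "kairosdb": to_kairosdb}


def render(q: MetricQuery) -> Tuple[str, dict]:
    """Resolve the semantic metric and build the backend-specific request."""
    backend, native_name = CATALOG[q.metric]
    return backend, RENDERERS[backend](replace(q, metric=native_name))
```

A caller would write `render(MetricQuery("cpu_usage", start, end, {"namespace": "default"}))` and never see PromQL or the KairosDB JSON syntax; supporting another database under these assumptions means adding one renderer function and the corresponding catalog entries.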
Pages: 151-161
Number of pages: 11