Fuzzy Join for Flexible Combining Big Data Lakes in Cyber-Physical Systems

Cited by: 14
Authors
Malysiak-Mrozek, Bozena [1]
Lipinska, Anna [1]
Mrozek, Dariusz [1]
Affiliations
[1] Silesian Tech Univ, Inst Informat, PL-44100 Gliwice, Poland
Source
IEEE ACCESS | 2018 / Vol. 6
Keywords
Cyber-physical systems; big data; fuzzy logic; querying; cloud computing; biomedical data analysis; declarative languages; DATA ANALYTICS; MAPREDUCE; ARCHITECTURE; IMPLEMENTATION; FRAMEWORK;
DOI
10.1109/ACCESS.2018.2879829
CLC number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cyber-physical systems produce large amounts of data that are stored in domain-related data lakes in a variety of formats. By using the big data technologies that enable efficient data processing, the value of the data increases, as these technologies can turn the data into actionable information that influences important decision-making processes. However, a broader view of the operational environment, the investigated phenomena, and the challenges related to them can frequently be obtained only after combining data from many data sets located in various big data lakes. This requires contact points in both data lakes that must be flexibly joined because, in many cases, data sets do not correspond to one another directly. In this paper, we present a fuzzy join operation for flexibly combining big data lakes. The fuzzy join transforms numerical values of common attributes of the joined data sets into fuzzy sets and uses such a representation in the join operation. We propose two variants of the join operation that transform crisp numerical values of joining attributes into: 1) fuzzy numbers and 2) linguistic terms. The fuzzy join operation is implemented and tested in the declarative U-SQL language that is used for scalable and parallel querying in big data lakes. The ideas presented here are exemplified by a distributed analysis of cardiac disease data on the Microsoft Azure cloud. The results of the conducted experiments confirm that the fuzzy join can enrich data sets that are used in making critical decisions and, as a highly scalable cloud-based solution, can be successfully used in processing large volumes of data delivered by cyber-physical systems.
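The two join variants named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' U-SQL implementation: the trapezoidal membership function, the `spread` and `min_degree` parameters, the `AGE_TERMS` boundaries, and all function names are assumptions chosen only for illustration.

```python
from typing import Dict, List, Tuple

Row = Tuple[float, str]  # (numeric join attribute, payload); hypothetical row shape


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership degree of x in a trapezoidal fuzzy set with support [a, d] and core [b, c]."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzy_number_join(left: List[Row], right: List[Row],
                      spread: float, min_degree: float) -> List[Tuple[str, str, float]]:
    """Variant 1 (assumed form): treat each left key k as the triangular fuzzy
    number 'about k' and keep pairs whose right key belongs to it strongly enough."""
    result = []
    for lkey, lval in left:
        for rkey, rval in right:
            degree = trapezoid(rkey, lkey - spread, lkey, lkey, lkey + spread)
            if degree >= min_degree:
                result.append((lval, rval, degree))
    return result


# Hypothetical linguistic terms for an "age" attribute, given as (a, b, c, d).
AGE_TERMS: Dict[str, Tuple[float, float, float, float]] = {
    "young": (0, 0, 30, 40),
    "middle": (30, 40, 55, 65),
    "old": (55, 65, 120, 120),
}


def best_term(x: float, terms: Dict[str, Tuple[float, float, float, float]]) -> str:
    """Return the linguistic term with the highest membership degree for x."""
    return max(terms, key=lambda name: trapezoid(x, *terms[name]))


def linguistic_join(left: List[Row], right: List[Row],
                    terms: Dict[str, Tuple[float, float, float, float]]) -> List[Tuple[str, str, str]]:
    """Variant 2 (assumed form): map both keys to linguistic terms and join on equal terms."""
    result = []
    for lkey, lval in left:
        lterm = best_term(lkey, terms)
        for rkey, rval in right:
            if best_term(rkey, terms) == lterm:
                result.append((lval, rval, lterm))
    return result
```

The nested loops keep the sketch readable; a scalable version would partition rows by term or by key range before comparing them, which is the kind of work a distributed query engine would take over.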
Pages: 69545-69558
Page count: 14