Managing Heterogeneous Data on a Big Data Platform: A Multi-Criteria Decision Making Model for Data-Intensive Science

Cited by: 7
Authors
Pal, Gautam [1]
Atkinson, Katie [1]
Li, Gangmin [2]
Affiliations
[1] Univ Liverpool, Dept Comp Sci, Liverpool, Merseyside, England
[2] Xian Jiaotong Liverpool Univ, Dept Comp Sci, Suzhou, Peoples R China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP 2020) | 2020
Keywords
Multi-criteria decision making; Multi-agent systems; 3 Vs of big data; Fuzzy graph; NoSQL databases
DOI
10.1109/BigComp48618.2020.00-69
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper presents an approach to solving the data variety problem of big data through an offline and online decision-making system. We present a graph-based approach that models a real-world problem domain as a set of criteria and problem solvers, and we introduce a multi-criteria decision-making model that selects the set of problem solvers that best meets the criteria. For example, suppose a system is processing Twitter data that arrives as a stream of JSON records from multiple data sources. The decision system determines which of the available methods to use for a given list of requirements (criteria). When multiple criteria (must-meet requirements) coexist in a problem domain, their order of importance, their mutual influence on each other, and their level of indispensability form a graph structure. In the proposed model, each vertex of the graph represents a criterion or the benefit of an agent against that criterion. The mutual influence of multiple agents is denoted by the connecting edges of the graph. We also propose a fuzzy graph framework to model real-world unpredictability. The model produces benchmarking results for each of the problem solvers in terms of absolute values to support decision making. The model is implemented through TopBread, Resource Description Framework (RDF), and RDF Data Query Language (RDQL). The key advantage of the proposed model over existing ones is that the framework can operate in a dual mode, both as a standalone offline tool and as an online decision-making gateway; it can also be used in high-velocity ingestion scenarios.
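The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's graph-based model: it swaps in a plain weighted sum over fuzzy membership degrees in [0, 1] and ignores the mutual influence between agents that the paper encodes as edges. The function name, the criteria, the solver names, and every number below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def rank_solvers(
    weights: Dict[str, float],
    memberships: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Rank problem solvers by a normalized weighted sum of fuzzy degrees.

    weights:     importance of each criterion (hypothetical values).
    memberships: for each solver, the degree in [0, 1] to which it
                 satisfies each criterion (hypothetical values).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("criterion weights must sum to a positive value")

    scores = {}
    for solver, degrees in memberships.items():
        # A criterion the solver has no degree for counts as 0.0.
        scores[solver] = sum(
            w * degrees.get(criterion, 0.0) for criterion, w in weights.items()
        ) / total

    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical criteria for ingesting a stream of JSON records.
    weights = {"schema_flexibility": 0.5, "ingest_throughput": 0.3, "query_latency": 0.2}
    # Hypothetical solvers and satisfaction degrees.
    memberships = {
        "document_store": {"schema_flexibility": 0.9, "ingest_throughput": 0.6, "query_latency": 0.5},
        "key_value_store": {"schema_flexibility": 0.4, "ingest_throughput": 0.9, "query_latency": 0.8},
    }
    for solver, score in rank_solvers(weights, memberships):
        print(f"{solver}: {score:.2f}")
```

With these made-up inputs the document store ranks first because schema flexibility carries the largest weight; changing the weights changes the ranking, which is the point of treating the choice as a multi-criteria decision.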
Pages: 229-239
Page count: 11