Parallel computation of probabilistic skyline queries using MapReduce

Cited by: 3
Author
Gavagsaz, Elaheh [1]
Affiliation
[1] Eqbal Lahoori Inst Higher Educ, Mashhad, Razavi Khorasan, Iran
Keywords
Probabilistic skyline query; Uncertain data; Multidimensional database; Parallel computation; Skewed data; Big data; Algorithms
DOI
10.1007/s11227-020-03279-x
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, numerous applications have continuously generated large amounts of uncertain data. Advanced analysis queries such as skyline operators are essential for extracting interesting objects from such vast uncertain datasets. Recently, the MapReduce system has been widely used for big data analysis. However, because the probabilistic skyline query is not decomposable, implementing it in the MapReduce framework is not straightforward. This paper proposes an effective parallel method called parallel computation of probabilistic skyline query (PCPS), which can compute the probabilistic skyline set in one MapReduce computation pass. The proposed method takes critical sections into account and detects data with a high probability of existence through a proposed smart sampling algorithm. PCPS also implements a new approach to the fair allocation of input data. The experimental results indicate that the proposed approach not only reduces the processing time of probabilistic skyline queries but also achieves fair precision across varying dimensionality.
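The abstract's remark that the probabilistic skyline query is not decomposable can be illustrated with a small sketch. It is not the PCPS method. It assumes a common tuple-level uncertainty model that the abstract does not spell out: each point exists independently with a given probability, smaller values are better in every dimension, and a point's skyline probability is its own existence probability times the probability that none of its dominators exists. The function names and the sample data are hypothetical.

```python
from math import prod  # requires Python 3.8+

def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_probability(target, dataset):
    """P(target exists) times the product of (1 - P(d exists)) over every
    point d in dataset that dominates target (independence assumed)."""
    point, p_exist = target
    return p_exist * prod(1.0 - q for other, q in dataset if dominates(other, point))

# Hypothetical uncertain points: (coordinates, existence probability).
data = [((1, 4), 0.5), ((2, 5), 0.9), ((3, 1), 0.8), ((4, 3), 0.6)]

# Split the data as two independent workers might see it.
partition_a = [data[0], data[3]]
partition_b = [data[1], data[2]]

target = data[1]  # the point (2, 5), dominated only by (1, 4)
local_value = skyline_probability(target, partition_b)
global_value = skyline_probability(target, data)
print(local_value, global_value)
```

Here the only dominator of (2, 5) sits in the other partition, so the locally computed probability overstates the global one. A worker cannot finalize a point's probability from its own partition alone, which is why a naive per-partition computation followed by a simple merge does not give the correct result.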
Pages: 418-444
Number of pages: 27