Probabilistic skylines on uncertain data: model and bounding-pruning-refining methods

被引:17
作者
Jiang, Bin [1 ]
Pei, Jian [1 ]
Lin, Xuemin [2 ,3 ]
Yuan, Yidong [2 ,3 ]
机构
[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada
[2] Univ New S Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[3] NICTA, Sydney, NSW, Australia
基金
澳大利亚研究理事会; 加拿大自然科学与工程研究理事会;
关键词
Uncertain data; Skyline queries; Probabilistic queries; Algorithms; COMPUTATION; MAXIMA; SET;
D O I
10.1007/s10844-010-0141-4
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Uncertain data are inherent in some important applications. Although a considerable amount of research has been dedicated to modeling uncertain data and answering some types of queries on uncertain data, how to conduct advanced analysis on uncertain data remains an open problem at large. In this paper, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic skyline model where an uncertain object may take a probability to be in the skyline, and a p-skyline contains all objects whose skyline probabilities are at least p (0 < p a parts per thousand currency signaEuro parts per thousand 1). Computing probabilistic skylines on large uncertain data sets is challenging. We develop a bounding-pruning-refining framework and three algorithms systematically. The bottom-up algorithm computes the skyline probabilities of some selected instances of uncertain objects, and uses those instances to prune other instances and uncertain objects effectively. The top-down algorithm recursively partitions the instances of uncertain objects into subsets, and prunes subsets and objects aggressively. Combining the advantages of the bottom-up algorithm and the top-down algorithm, we develop a hybrid algorithm to further improve the performance. Our experimental results on both the real NBA player data set and the benchmark synthetic data sets show that probabilistic skylines are interesting and useful, and our algorithms are efficient on large data sets.
引用
收藏
页码:1 / 39
页数:39
相关论文
共 3 条
  • [1] Probabilistic skylines on uncertain data: model and bounding-pruning-refining methods
    Bin Jiang
    Jian Pei
    Xuemin Lin
    Yidong Yuan
    Journal of Intelligent Information Systems, 2012, 38 : 1 - 39
  • [2] Probabilistic object deputy model for uncertain data and lineage management
    Wang, Liang
    Wang, Liwei
    Peng, Zhiyong
    DATA & KNOWLEDGE ENGINEERING, 2017, 109 : 70 - 84
  • [3] A PROBABILISTIC MODEL MAPPING-BASED EVENT DETECTION APPROACH OVER UNCERTAIN SENSOR DATA
    Chen, Mo
    Yu, Ge
    Li, Chuanwen
    Hong, Bonghee
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER THEORY AND ENGINEERING (ICACTE 2009), VOLS 1 AND 2, 2009, : 1615 - 1621