Exploring correlation for fast skyline computation

被引:3
作者
Yu, Boseon [1 ]
Choi, Wonik [2 ]
Liu, Ling [3 ]
机构
[1] Korea Inst Sci & Technol, Hwarang Ro 14gil 5, Seoul, South Korea
[2] Inha Univ, Sch Informat & Commun Engn, 100 Inharo, Incheon, South Korea
[3] Georgia Inst Technol, Coll Comp, 266 Ferst Dr, Atlanta, GA 30332 USA
关键词
Skyline; Information extraction; Data analysis; Parallel computing; MULTICORE; QUERIES;
D O I
10.1007/s11227-017-2064-0
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Scaling skyline queries over high-dimensional datasets remains to be challenging due to the fact that most existing algorithms assume dimensional independence when establishing the worst-case complexity by discarding correlation distribution. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets with three novel features: First, it offers a fast hash-based method to prune non-skyline points by utilizing data correlation characteristics and speed up the overall skyline evaluation for correlated datasets. Second, we develop , which can dramatically reduce the response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, the HashSkyline approach uses the pivot cell-based mechanism combined with the correlation threshold to determine the correlation distribution characteristics for a given dataset, enabling adaptive configuration of HashSkyline for skyline query evaluation by auto-switching of and . We evaluate the validity of HashSkyline using both synthetic datasets and real datasets. Our experiments show that HashSkyline consumes significantly less pre-processing cost and achieves significantly higher overall query performance, compared to existing state-of-the-art algorithms.
引用
收藏
页码:5071 / 5102
页数:32
相关论文
共 50 条
  • [41] Scalable Skyline Computation Using Object-based Space Partitioning
    Zhang, Shiming
    Mamoulis, Nikos
    Cheung, David W.
    ACM SIGMOD/PODS 2009 CONFERENCE, 2009, : 483 - 494
  • [42] Efficient and privacy-preserving skyline computation framework across domains
    Liu, Ximeng
    Lu, Rongxing
    Ma, Jianfeng
    Chen, Le
    Bao, Haiyong
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2016, 62 : 161 - 174
  • [43] Probabilistic n-of-N skyline computation over uncertain data streams
    Zhang, Wenjie
    Li, Aiping
    Cheema, Muhammad Aamir
    Zhang, Ying
    Chang, Lijun
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2015, 18 (05): : 1331 - 1350
  • [44] Towards Progressive and Load Balancing Distributed Computation: A Case Study on Skyline Analysis
    Huang, Jin
    Zhao, Feng
    Chen, Jian
    Pei, Jian
    Yin, Jian
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (03) : 431 - 443
  • [45] Parallelization of group-based skyline computation for multi-core processors
    Zhu, Haoyang
    Zhu, Peidong
    Li, Xiaoyong
    Liu, Qiang
    Xun, Peng
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (18)
  • [46] Towards Progressive and Load Balancing Distributed Computation: A Case Study on Skyline Analysis
    Jin Huang
    Feng Zhao
    Jian Chen
    Jian Pei
    Jian Yin
    Journal of Computer Science and Technology, 2010, 25 : 431 - 443
  • [47] Towards Progressive and Load Balancing Distributed Computation:A Case Study on Skyline Analysis
    黄晋
    赵丰
    陈健
    裴健
    印鉴
    JournalofComputerScience&Technology, 2010, 25 (03) : 431 - 443
  • [48] Probabilistic n-of-N skyline computation over uncertain data streams
    Wenjie Zhang
    Aiping Li
    Muhammad Aamir Cheema
    Ying Zhang
    Lijun Chang
    World Wide Web, 2015, 18 : 1331 - 1350
  • [49] Ranking the big sky: efficient top-k skyline computation on massive data
    Han, Xixian
    Wang, Bailing
    Li, Jianzhong
    Gao, Hong
    KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 60 (01) : 415 - 446
  • [50] Ranking the big sky: efficient top-k skyline computation on massive data
    Xixian Han
    Bailing Wang
    Jianzhong Li
    Hong Gao
    Knowledge and Information Systems, 2019, 60 : 415 - 446