Exploring correlation for fast skyline computation

Cited by: 3
Authors
Yu, Boseon [1]
Choi, Wonik [2]
Liu, Ling [3]
Affiliations
[1] Korea Inst Sci & Technol, Hwarang Ro 14gil 5, Seoul, South Korea
[2] Inha Univ, Sch Informat & Commun Engn, 100 Inharo, Incheon, South Korea
[3] Georgia Inst Technol, Coll Comp, 266 Ferst Dr, Atlanta, GA 30332 USA
Keywords
Skyline; Information extraction; Data analysis; Parallel computing; Multicore; Queries
DOI
10.1007/s11227-017-2064-0
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Scaling skyline queries over high-dimensional datasets remains challenging because most existing algorithms assume dimensional independence when establishing worst-case complexity, disregarding the correlation distribution of the data. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets, with three novel features. First, it offers a fast hash-based method that prunes non-skyline points by exploiting data correlation characteristics, speeding up the overall skyline evaluation for correlated datasets. Second, we develop a GPU-based variant, which can dramatically reduce the response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, the HashSkyline approach uses a pivot cell-based mechanism combined with a correlation threshold to determine the correlation distribution characteristics of a given dataset, enabling adaptive configuration of HashSkyline for skyline query evaluation by automatically switching between the two methods. We evaluate the validity of HashSkyline using both synthetic and real datasets. Our experiments show that HashSkyline incurs significantly lower pre-processing cost and achieves significantly higher overall query performance than existing state-of-the-art algorithms.
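For readers unfamiliar with the term, the sketch below illustrates what a skyline query returns and why data correlation matters. It is not the HashSkyline algorithm: it is a naive quadratic baseline, and the names `dominates` and `naive_skyline`, the "smaller is better" convention, and the toy datasets are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q (assumption: smaller is better).

    p dominates q when p is no worse in every dimension and strictly
    better in at least one.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def naive_skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy 2-D datasets (hypothetical): perfectly correlated vs. anti-correlated.
correlated = [(i, i) for i in range(1, 6)]
anti_correlated = [(i, 6 - i) for i in range(1, 6)]

print(naive_skyline(correlated))       # a single point dominates all the others
print(naive_skyline(anti_correlated))  # no point dominates any other
```

On correlated data most points are dominated early and the skyline is small, so pruning is cheap; on anti-correlated data nearly every point survives, which is the case where parallel hardware is most useful.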
Pages: 5071-5102
Number of pages: 32