Exploring correlation for fast skyline computation

Cited by: 3
Authors
Yu, Boseon [1]
Choi, Wonik [2]
Liu, Ling [3]
Affiliations
[1] Korea Inst Sci & Technol, Hwarang Ro 14gil 5, Seoul, South Korea
[2] Inha Univ, Sch Informat & Commun Engn, 100 Inharo, Incheon, South Korea
[3] Georgia Inst Technol, Coll Comp, 266 Ferst Dr, Atlanta, GA 30332 USA
Keywords
Skyline; Information extraction; Data analysis; Parallel computing; Multicore; Queries
DOI
10.1007/s11227-017-2064-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Scaling skyline queries over high-dimensional datasets remains challenging because most existing algorithms assume dimensional independence when establishing worst-case complexity and thus disregard the correlation distribution of the data. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets, with three novel features. First, it offers a fast hash-based method that prunes non-skyline points by exploiting data correlation characteristics, speeding up overall skyline evaluation for correlated datasets. Second, we develop a GPU-based method that can dramatically reduce the response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, HashSkyline uses a pivot cell-based mechanism combined with a correlation threshold to determine the correlation distribution characteristics of a given dataset, enabling adaptive configuration for skyline query evaluation by automatically switching between the hash-based and GPU-based methods. We evaluate the validity of HashSkyline using both synthetic and real datasets. Our experiments show that HashSkyline incurs significantly lower pre-processing cost and achieves significantly higher overall query performance than existing state-of-the-art algorithms.
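For readers unfamiliar with the term, a skyline query returns the points that no other point dominates. The following is a minimal sketch of that definition only, not the HashSkyline algorithm from the paper: it assumes smaller values are preferred in every dimension, and the names `dominates` and `skyline` are hypothetical helpers chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q (assumption: smaller is better).

    p dominates q when p is no worse in every dimension and strictly
    better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive window-based skyline; quadratic in the worst case."""
    window: List[Point] = []
    for p in points:
        # Skip p if a point already in the window dominates it.
        if any(dominates(q, p) for q in window):
            continue
        # Otherwise evict the window points that p dominates, then keep p.
        window = [q for q in window if not dominates(p, q)]
        window.append(p)
    return window


if __name__ == "__main__":
    # Anti-correlated points tend to produce a large skyline.
    print(skyline([(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]))
    # Perfectly correlated points collapse to a single skyline point.
    print(skyline([(1, 1), (2, 2), (3, 3)]))
```

The two calls illustrate why the data distribution matters: on correlated data a few points dominate most others, so pruning is cheap, whereas on anti-correlated data many points survive and far more comparisons are needed.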
Pages: 5071-5102
Page count: 32