Exploring correlation for fast skyline computation

Cited by: 3

Authors:

Yu, Boseon [1]

Choi, Wonik [2]

Liu, Ling [3]

Affiliations:

[1] Korea Inst Sci & Technol, Hwarang Ro 14gil 5, Seoul, South Korea

[2] Inha Univ, Sch Informat & Commun Engn, 100 Inharo, Incheon, South Korea

[3] Georgia Inst Technol, Coll Comp, 266 Ferst Dr, Atlanta, GA 30332 USA

Source:

JOURNAL OF SUPERCOMPUTING | 2017 / Vol. 73 / Issue 11

Keywords:

Skyline; Information extraction; Data analysis; Parallel computing; MULTICORE; QUERIES;

DOI:

10.1007/s11227-017-2064-0

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Scaling skyline queries over high-dimensional datasets remains challenging because most existing algorithms assume dimensional independence when establishing worst-case complexity and disregard the correlation distribution of the data. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets with three novel features. First, it offers a fast hash-based method that prunes non-skyline points by utilizing data correlation characteristics, speeding up the overall skyline evaluation for correlated datasets. Second, we develop , which can dramatically reduce the response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, the HashSkyline approach uses a pivot cell-based mechanism combined with a correlation threshold to determine the correlation distribution characteristics of a given dataset, enabling adaptive configuration of HashSkyline for skyline query evaluation by auto-switching of and . We evaluate the validity of HashSkyline using both synthetic and real datasets. Our experiments show that HashSkyline incurs significantly lower pre-processing cost and achieves significantly higher overall query performance than existing state-of-the-art algorithms.

Pages: 5071-5102

Page count: 32
