A two-phase data space partitioning for efficient skyline computation

Cited by: 4
Authors
Nasridinov, Aziz [1 ]
Choi, Jong-Hyeok [1 ]
Park, Young-Ho [2 ]
Affiliations
[1] Chungbuk Natl Univ, Dept Comp Sci, Data Analyt Lab, Cheongju, South Korea
[2] Sookmyung Womens Univ, Dept IT Engn, Engn Sch, Seoul, South Korea
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2017, Vol. 20, No. 4
Keywords
Data space partitioning; Skyline; Database; LAYER-BASED INDEX; TOP-K QUERIES; CONVEX SKYLINE; GPU;
DOI
10.1007/s10586-017-1070-6
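Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract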
The skyline has attracted a lot of attention due to its wide application in various fields. However, skyline computation is challenging because today's applications are likely to deal with large, high-dimensional data. As skyline computation for such a huge amount of data is time-consuming, parallel and distributed skyline computations are considered. State-of-the-art methods for parallel and distributed skyline computation use various data space partitioning techniques. However, these methods are not efficient: in certain cases they perform unnecessary skyline computations in a partitioned space whose local-skyline tuples do not contribute to the global skyline. This may impose additional processing overhead and increase the overall skyline computation time. In this paper, we propose a novel data space partitioning method for parallel and distributed skyline computation that consists of two phases: diagonal partitioning and entropy score curve based partitioning. The proposed method produces a small set of local-skyline tuples and leads to a more sophisticated merging step. The experimental results demonstrate that the proposed method reduces the number of comparisons and the processing time of skyline computation on large amounts of data when compared with existing state-of-the-art methods.
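The general partition-then-merge pattern that the abstract refers to can be illustrated with a minimal sketch. This is not the paper's diagonal or entropy score curve partitioning: the round-robin split, the function names, and the "smaller is better in every dimension" convention are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def partitioned_skyline(points: Sequence[Point], num_partitions: int) -> List[Point]:
    """Compute local skylines per partition, then merge them.

    The round-robin split is a hypothetical stand-in for a real data
    space partitioning scheme; it is not the method proposed in the paper.
    """
    partitions = [points[i::num_partitions] for i in range(num_partitions)]
    local_skylines = [skyline(part) for part in partitions]
    # Every global-skyline point survives in its own partition, so the
    # skyline of the combined local results equals the global skyline.
    candidates = [p for local in local_skylines for p in local]
    return skyline(candidates)


points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (2, 6)]
result = partitioned_skyline(points, num_partitions=2)
```

In this example the point `(4, 4)` survives as a local-skyline tuple in its partition but is discarded during the merge, because `(2, 2)` from the other partition dominates it. This is the kind of local work that does not contribute to the global skyline, which the abstract identifies as a source of overhead.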
Pages: 3617-3628
Page count: 12