LShape Partitioning: Parallel Skyline Query Processing Using MapReduce

被引:3
作者
Wijayanto, Heri [1 ]
Wang, Wenlu [2 ]
Ku, Wei-Shinn [2 ]
Chen, Arbee L. P. [1 ]
机构
[1] Asia Univ, Dept Comp Sci & Informat Engn, Taichung 41354, Taiwan
[2] Auburn Univ, Dept Comp Sci & Software Engn, Auburn, AL 36849 USA
基金
美国国家科学基金会;
关键词
Skyline query; parallel computing; partitioning strategy; MapReduce; COMPUTATION; PRODUCTS;
D O I
10.1109/TKDE.2020.3021470
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
A skyline query searches the data points that are not dominated by others in the dataset. It is widely adopted for many applications which require multi-criteria decision making. However, skyline query processing is considerably time-consuming for a high-dimensional large scale dataset. Parallel computing techniques are therefore needed to address this challenge, among which MapReduce is one of the most popular frameworks to process big data. A great number of efficient MapReduce skyline algorithms have been proposed in the literature and most of their designs focus on partitioning and pruning the given dataset. However, there are still opportunities for further parallelism. In this study, we propose two parallel skyline processing algorithms using a novel LShape partitioning strategy and an effective Propagation Filtering method. These two algorithms are 2Phase LShape and 1Phase LShape, used for multiple reducers and single reducer, respectively. By extensive experiments, we verify that our algorithms outperformed the state-of-the-art approaches, especially for high-dimensional large scale datasets.
引用
收藏
页码:3363 / 3376
页数:14
相关论文
共 28 条
[1]   Parameterized neural networks for high-energy physics [J].
Baldi, Pierre ;
Cranmer, Kyle ;
Faucett, Taylor ;
Sadowski, Peter ;
Whiteson, Daniel .
EUROPEAN PHYSICAL JOURNAL C, 2016, 76 (05)
[2]   The Skyline operator [J].
Börzsönyi, S ;
Kossmann, D ;
Stocker, K .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :421-430
[3]   MapReduce Skyline Query Processing with A New Angular Partitioning Approach [J].
Chen, Liang ;
Hwang, Kai ;
Wu, Jian .
2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, :2262-2270
[4]   Skyline with presorting [J].
Chomicki, J ;
Godfrey, P ;
Gryz, J ;
Liang, DM .
19TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2003, :717-719
[5]  
Dellis E., 2007, Proceedings of the 33rd international conference on Very large data bases, P291
[6]  
Kian-Lee Tan, 2001, Proceedings of the 27th International Conference on Very Large Data Bases, P301
[7]   MapReduce skyline query processing with partitioning and distributed dominance tests [J].
Koh, Jia-Ling ;
Chen, Chia-Ching ;
Chan, Chih-Yu ;
Chen, Arbee L. P. .
INFORMATION SCIENCES, 2017, 375 :114-137
[8]  
Koh JL, 2014, VLDB J, V23, P541, DOI 10.1007/s00778-013-0336-8
[9]  
Lee Ken C. K., 2007, P 33 INT C VER LARG, P279
[10]   Determining k-Most Demanding Products with Maximum Expected Number of Total Customers [J].
Lin, Chen-Yi ;
Koh, Jia-Ling ;
Chen, Arbee L. P. .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (08) :1732-1747