Efficient computation of G-Skyline groups on massive data

被引:5
作者
Han, Xixian [1 ]
Wang, Jinbao [1 ]
Li, Jianzhong [1 ]
Gao, Hong [1 ]
机构
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
关键词
G; -Skyline; Sublinear-I; O method; Reuse principle; Massive data; OPTIMIZATION; NETWORKS;
D O I
10.1016/j.ins.2021.12.028
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In many practical applications, G-Skyline query is an important operation to return the best tuple groups, which are not g-dominated by other tuple groups of the same size, from a potentially huge data space. It is found that the existing G-Skyline algorithms cannot deal well with massive data due to high I/O cost and high computation cost. This paper proposes a novel GPR algorithm, which is based on presorting and reuse principle, to compute G -Skyline groups on massive data efficiently. The execution of GPR consists of two phases: acquisition of the candidate tuples and computation of G-Skyline groups. The sublinear-I/O method is proposed in phase 1 to scan the presorted table, which is proved to hold early termination property. This paper devises the basic framework of phase 2 and analyzes its execution cost. The SR strategy is utilized to reuse the subset computation results effec-tively and reduce the execution cost of phase 2 considerably. The extensive experimental results, conducted on synthetic and real-life data sets, show that GPR outperforms the existing algorithms significantly.(c) 2021 Elsevier Inc. All rights reserved.
引用
收藏
页码:300 / 322
页数:23
相关论文
共 41 条
[1]  
[Anonymous], 2009, Introduction to Algorithms
[2]   Skyline with presorting [J].
Chomicki, J ;
Godfrey, P ;
Gryz, J ;
Liang, DM .
19TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2003, :717-719
[3]   Effective and efficient skyline query processing over attribute-order-preserving-free encrypted data in cloud-enabled databases [J].
Cuzzocrea, Alfredo ;
Karras, Panagiotis ;
Vlachou, Akrivi .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 126 :237-251
[4]   Accelerating Skycube Computation with Partial and Parallel Processing for Service Selection [J].
Dong, Fang ;
Luo, Junzhou ;
Jin, Jiahui ;
Shi, Jiyuan ;
Yang, Ye ;
Shen, Jun .
IEEE TRANSACTIONS ON SERVICES COMPUTING, 2020, 13 (06) :969-984
[5]   A Gradient-Based Search Method for Multi-objective Optimization Problems [J].
Gao, Weifeng ;
Wang, Yiming ;
Liu, Lingling ;
Huang, Lingling .
INFORMATION SCIENCES, 2021, 578 :129-146
[6]  
Godfrey P, 2004, LECT NOTES COMPUT SC, V2942, P78
[7]   Algorithms and analyses for maximal vector computation [J].
Godfrey, Parke ;
Shipley, Ryan ;
Gryz, Jarek .
VLDB JOURNAL, 2007, 16 (01) :5-28
[8]   Efficient Skyline Computation on Big Data [J].
Han, Xixian ;
Li, Jianzhong ;
Yang, Donghua ;
Wang, Jinbao .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (11) :2521-2535
[9]   An efficient Pareto-based feature selection algorithm for multi-label classification [J].
Hashemi, Amin ;
Dowlatshahi, Mohammad Bagher ;
Nezamabadi-pour, Hossein .
INFORMATION SCIENCES, 2021, 581 :428-447
[10]   Group skyline computation [J].
Im, Hyeonseung ;
Park, Sungwoo .
INFORMATION SCIENCES, 2012, 188 :151-169