Efficient computation of G-Skyline groups on massive data

被引:5
作者
Han, Xixian [1 ]
Wang, Jinbao [1 ]
Li, Jianzhong [1 ]
Gao, Hong [1 ]
机构
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
关键词
G; -Skyline; Sublinear-I; O method; Reuse principle; Massive data; OPTIMIZATION; NETWORKS;
D O I
10.1016/j.ins.2021.12.028
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In many practical applications, G-Skyline query is an important operation to return the best tuple groups, which are not g-dominated by other tuple groups of the same size, from a potentially huge data space. It is found that the existing G-Skyline algorithms cannot deal well with massive data due to high I/O cost and high computation cost. This paper proposes a novel GPR algorithm, which is based on presorting and reuse principle, to compute G -Skyline groups on massive data efficiently. The execution of GPR consists of two phases: acquisition of the candidate tuples and computation of G-Skyline groups. The sublinear-I/O method is proposed in phase 1 to scan the presorted table, which is proved to hold early termination property. This paper devises the basic framework of phase 2 and analyzes its execution cost. The SR strategy is utilized to reuse the subset computation results effec-tively and reduce the execution cost of phase 2 considerably. The extensive experimental results, conducted on synthetic and real-life data sets, show that GPR outperforms the existing algorithms significantly.(c) 2021 Elsevier Inc. All rights reserved.
引用
收藏
页码:300 / 322
页数:23
相关论文
共 41 条
[41]   A dual-population algorithm based on alternative evolution and degeneration for solving constrained multi-objective optimization problems [J].
Zou, Juan ;
Sun, Ruiqing ;
Yang, Shengxiang ;
Zheng, Jinhua .
INFORMATION SCIENCES, 2021, 579 :89-102