Ranking the big sky: efficient top-k skyline computation on massive data

被引:5
作者
Han, Xixian [1 ]
Wang, Bailing [1 ]
Li, Jianzhong [1 ]
Gao, Hong [1 ]
机构
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China
基金
中国国家自然科学基金;
关键词
Massive data; Top-k skyline; RSTS algorithm; Table scan; Pruning operation; QUERIES; ALGORITHMS; DISTANCE;
D O I
10.1007/s10115-018-1256-0
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In many applications, top-k skyline query is an important operation to return k skyline tuples with the highest domination scores in a potentially huge data space. It is analyzed that the existing algorithms cannot process top-k skyline query on massive data efficiently. In this paper, we propose a novel table-scan-based algorithm RSTS to compute top-k skyline results on massive data efficiently. RSTS first builds the presorted table, whose tuples are arranged in the order of round-robin retrieval on sorted column lists. RSTS consists of two phases. In phase 1, the candidate tuples are acquired by the sequential scan on the presorted table. In phase 2, RSTS calculates the domination scores of the candidates and returns query results by another sequential scan. It is proved that RSTS has the characteristic of early termination, along with the theoretical analysis of scan depths. The pruning rule for candidate tuples is devised in this paper. The theoretical pruning effect shows that majority of the skyline results can be discarded directly. The extensive experimental results, conducted on synthetic and real-life data sets, show that RSTS outperforms the existing algorithms significantly.
引用
收藏
页码:415 / 446
页数:32
相关论文
共 37 条
[1]   Discovering the Skyline of Web Databases [J].
Asudeh, Abolfazl ;
Thirumuruganathan, Saravanan ;
Zhang, Nan ;
Das, Gautam .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (07) :600-611
[2]   The Skyline operator [J].
Börzsönyi, S ;
Kossmann, D ;
Stocker, K .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :421-430
[3]  
Chan Chee-Yong., 2006, PROC ACM SPECIAL INT, P503
[4]  
Chang CF, 2006, SOHN INTERNATIONAL SYMPOSIUM ADVANCED PROCESSING OF METALS AND MATERIALS, VOL 4, P447
[5]   Neural skyline filter for accelerating skyline search algorithms [J].
Chen, Yi-Chung ;
Lee, Chiang .
EXPERT SYSTEMS, 2015, 32 (01) :108-131
[6]   Skyline with presorting [J].
Chomicki, J ;
Godfrey, P ;
Gryz, J ;
Liang, DM .
19TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2003, :717-719
[7]  
Das Sarma A, 2011, PROC INT CONF DATA, P387, DOI 10.1109/ICDE.2011.5767873
[8]   Optimal aggregation algorithms for middleware [J].
Fagin, R ;
Lotem, A ;
Naor, M .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (04) :614-656
[9]  
Fernandez R, 2015, CIDR 2015
[10]   Efficient algorithms for finding the most desirable skyline objects [J].
Gao, Yunjun ;
Liu, Qing ;
Chen, Lu ;
Chen, Gang ;
Li, Qing .
KNOWLEDGE-BASED SYSTEMS, 2015, 89 :250-264