Algorithms and analyses for maximal vector computation

被引:116
作者
Godfrey, Parke [1 ]
Shipley, Ryan
Gryz, Jarek
机构
[1] York Univ, Toronto, ON M3J 1PE, Canada
[2] Coll William & Mary, Williamsburg, VA 23187 USA
关键词
D O I
10.1007/s00778-006-0029-7
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The maximal vector problem is to identify the maximals over a collection of vectors. This arises in many contexts and, as such, has been well studied. The problem recently gained renewed attention with skyline queries for relational databases and with work to develop skyline algorithms that are external and relationally well behaved. While many algorithms have been proposed, how they perform has been unclear. We study the performance of, and design choices behind, these algorithms. We prove runtime bounds based on the number of vectors n and the dimensionality k. Early algorithms based on divide and conquer established seemingly good average and worst-case asymptotic run-times. In fact, the problem can be solved in O(n) average-case ( holding k as fixed). We prove, however, that the performance is quite bad with respect to k. We demonstrate that the more recent skyline algorithms are better behaved, and can also achieve O(kn) average-case. While k matters for these, in practice, its effect vanishes in the asymptotic. We introduce a new external algorithm, LESS, that is more efficient and better behaved. We evaluate LESS's effectiveness and improvement over the field, and prove that its average-case running time is O(kn).
引用
收藏
页码:5 / 28
页数:24
相关论文
共 34 条
[1]  
Balke W., 2004, P 30 INT C VERY LARG, P936, DOI DOI 10.1016/B978-012088469-8.50082-6
[2]  
BALKE WT, 2005, IJCAI 05 WORKSH ADV, P1
[3]  
BALKE WT, 2004, IASTED INT C INT MUL, P1
[4]   ON DISTRIBUTION OF NUMBER OF ADMISSIBLE POINTS IN A VECTOR RANDOM SAMPLE [J].
BARNDORF.O ;
SOBEL, M .
THEORY OF PROBILITY AND ITS APPLICATIONS,USSR, 1966, 11 (02) :249-&
[5]   AVERAGE NUMBER OF MAXIMA IN A SET OF VECTORS AND APPLICATIONS [J].
BENTLEY, JL ;
KUNG, HT ;
SCHKOLNICK, M ;
THOMPSON, CD .
JOURNAL OF THE ACM, 1978, 25 (04) :536-543
[6]  
BENTLEY JL, 1990, PROCEEDINGS OF THE FIRST ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P179
[7]  
Blum M., 1973, Journal of Computer and System Sciences, V7, P448, DOI 10.1016/S0022-0000(73)80033-9
[8]   The Skyline operator [J].
Börzsönyi, S ;
Kossmann, D ;
Stocker, K .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :421-430
[9]   ON THE AVERAGE NUMBER OF MAXIMA IN A SET OF VECTORS [J].
BUCHTA, C .
INFORMATION PROCESSING LETTERS, 1989, 33 (02) :63-65
[10]  
Chan C.-Y., 2005, ACM SIGMOD, P203