Run Placement Policies for Concurrent Mergesorts Using Parallel Prefetching

被引:0
作者
Kun-Lung Wu
Philip S. Yu
James Z. Teng
机构
[1] IBM T. J. Watson Research Center,
[2] IBM Programming Systems,undefined
关键词
Run placement; I/O parallelism; parallel prefetching; thrashing control; buffer management; mergesorts; concurrent merges;
D O I
10.1007/BF03325109
中图分类号
学科分类号
摘要
We study the performance of various run placement policies on disks for the merge phase of concurrent mergesorts using parallel prefetching. The initial sorted runs (input) of a merge and its final sorted run (output) are stored on multiple disks but each run resides only on a single disk. In this paper, we examine through detailed simulations three different run placement policies and the impact of buffer thrashing. The results show that, with buffer thrashing avoidance, the best performance can be achieved by a run placement policy that uses a proper subset of the disks dedicated for writing the output runs while the rest of the disks are used for prefetching the input runs in parallel. However, the proper number of write disks is workload dependent, and if not carefully chosen, it can adversely affect the system performance. In practice, a reasonably good performance can be achieved by a run placement policy that does not place the output run of a merge on any of the disks that store its own input runs but allows the output run to share the same disk with some of the input runs of other merges.
引用
收藏
页码:435 / 457
页数:22
相关论文
共 15 条
[1]  
Aggarwal A(1988)The input/output complexity of sorting and related problems Communications of the ACM 31 1116-1127
[2]  
Vitter J S(1983)Parallel algorithms for relational database operations ACM Trans. on Database Systems 8 324-353
[3]  
De Witt DJ(1985)The I/O performance of multiway mergesort and tag sort IEEE Trans. Computers 34 383-387
[4]  
Bitton D(1986)Buffer management in relational database systems ACM Trans. Database Systems 11 473-498
[5]  
Boral H(1989)Merging sorted runs using large main memory Acta Informatica 27 195-215
[6]  
Wilkinson W K(1993)Buffer management based on return on consumption in a multi-query environment VLDB Journal 2 1-37
[7]  
Kwan S C(1996)Speeding up external mergesort IEEE Trans. Knowledge and Data Engineering 8 322-332
[8]  
Baer J L(undefined)undefined undefined undefined undefined-undefined
[9]  
Sacco G M(undefined)undefined undefined undefined undefined-undefined
[10]  
Schkolnick M(undefined)undefined undefined undefined undefined-undefined