Automating layout of relational databases

Cited by: 10
Authors
Agrawal, S
Chaudhuri, S
Das, A
Narasayya, V
Source
19th International Conference on Data Engineering, Proceedings | 2003
DOI
10.1109/ICDE.2003.1260825
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The choice of database layout, i.e., how database objects such as tables and indexes are assigned to disk drives, can significantly impact the I/O performance of the system. Today, DBAs typically rely on fully striping objects across all available disk drives as the basic mechanism for optimizing I/O performance. While full striping maximizes I/O parallelism, when query execution involves co-access of two or more large objects, e.g., a merge join of two tables, the above strategy may be suboptimal due to the increased number of random I/O accesses on each disk drive. In this paper, we propose a framework for automating the choice of database layout for a given database that also takes into account the effects of co-accessed objects in the workload faced by the system. We formulate the above as an optimization problem and present an efficient solution to the problem that judiciously takes into account the trade-off between I/O parallelism and random I/O accesses. Our experiments on Microsoft SQL Server show the superior I/O performance of our techniques compared to the traditional approach of fully striping each database object across all disk drives.
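The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's formulation: the constants `TRANSFER_US_PER_BLOCK` and `SEEK_US`, the function `query_cost`, and the rule for charging seeks are all hypothetical assumptions chosen only to show why full striping can lose when two large objects are co-accessed.

```python
from typing import Dict, List, Set

# Assumed, illustrative constants (microseconds); not taken from the paper.
TRANSFER_US_PER_BLOCK = 100   # sequential transfer time per block
SEEK_US = 1000                # extra cost per block read out of sequence


def query_cost(layout: Dict[str, Set[int]],
               sizes: Dict[str, int],
               co_accessed: List[str]) -> float:
    """Estimated elapsed time of one query under a toy model.

    Each object is spread evenly over its disks. Disks work in parallel,
    so the query takes as long as its slowest disk. When a disk serves
    more than one co-accessed object, every block outside the largest
    share is assumed to cost one extra seek.
    """
    shares_per_disk: Dict[int, List[float]] = {}
    for obj in co_accessed:
        disks = layout[obj]
        share = sizes[obj] / len(disks)
        for disk in disks:
            shares_per_disk.setdefault(disk, []).append(share)

    worst = 0.0
    for shares in shares_per_disk.values():
        cost = sum(shares) * TRANSFER_US_PER_BLOCK
        if len(shares) > 1:
            cost += (sum(shares) - max(shares)) * SEEK_US
        worst = max(worst, cost)
    return worst


sizes = {"A": 1000, "B": 1000}                       # blocks per table
full_striping = {"A": {0, 1, 2, 3}, "B": {0, 1, 2, 3}}
separated = {"A": {0, 1}, "B": {2, 3}}

# Merge join co-accessing A and B: separation avoids the seek penalty.
join_full = query_cost(full_striping, sizes, ["A", "B"])
join_sep = query_cost(separated, sizes, ["A", "B"])

# Scan of A alone: full striping gives more parallelism.
scan_full = query_cost(full_striping, sizes, ["A"])
scan_sep = query_cost(separated, sizes, ["A"])

print(join_full, join_sep, scan_full, scan_sep)
```

Under these assumed constants the separated layout is cheaper for the join while full striping is cheaper for the single-table scan, which is the tension an automated layout tool has to weigh against the actual workload.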
Pages: 607-618
Page count: 12