On analyzing the errors in a selectivity estimation method using a multidimensional file structure

Cited by: 0
Authors
Kim, SW [1]
Whang, WK [1]
Whang, KY [1]
Affiliations
[1] Kangweon Natl Univ, Dept Informat & Telecommun Engn, Chunchon 200701, Kangwon Do, South Korea
DOI
10.1109/CMPSAC.1998.716635
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In this paper, we discuss the errors in selectivity estimation using the multilevel grid file (MLGF), a multidimensional file structure. We first identify the cause of the estimation errors, and then investigate five factors affecting the accuracy of estimation: (1) the data distribution in a region, (2) the number of records stored in the MLGF, (3) the page size, (4) the query region size, and (5) the level of the MLGF directory. Next, we present through extensive experiments the tendency of estimation errors when the value for each factor changes. The results show that the errors decrease when (1) the distribution of records in a region becomes closer to uniform, (2) the number of records in the MLGF increases, (3) the page size decreases, (4) the query region size increases, and (5) the level of the MLGF directory containing data distribution information becomes lower. We define the granule ratio, the core formula representing the basic relationship between the estimation error and the above five factors, and finally examine through experiments how the estimation errors change as the granule ratio changes. The results indicate that, for a specific value of the granule ratio, errors tend to be similar regardless of the values of the five factors.
Pages: 48-54
Number of pages: 7