A data-driven multidimensional indexing method for data mining in astrophysical databases

Cited by: 0
Authors
Frailis, M
De Angelis, A
Roberto, V
Affiliations
[1] Univ Udine, Dipartimento Fis, I-33100 Udine, Italy
[2] Ist Nazl Fis Nucl, Grp Coll Udine, Sez Trieste, I-33100 Udine, Italy
[3] Univ Udine, Dipartimento Matemat & Informat, I-33100 Udine, Italy
Keywords
multidimensional indexing; VAMSplit R-tree; nearest-neighbor query; one-class SVM; point sources
DOI
10.1155/ASP.2005.2514
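Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract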
Large archives and digital sky surveys with sizes of 10^11 bytes currently exist, and in the near future they will reach sizes of the order of 10^15 bytes. Numerical simulations are also producing comparable volumes of information. Data mining tools are needed to extract information from such large datasets. In this work, we propose a multidimensional indexing method, based on a static R-tree data structure, to efficiently query and mine large astrophysical datasets. We follow a top-down construction method, called VAMSplit, which recursively splits the dataset on a near-median element along the dimension with maximum variance. The resulting index partitions the dataset into nonoverlapping bounding boxes whose volumes reflect the local data density. Finally, we show an application of this method to the detection of point sources from a gamma-ray photon list.
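The top-down construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vam_split` and `bounding_box` and the `leaf_capacity` parameter are hypothetical, and the split is taken at the exact median index, whereas VAMSplit proper chooses a near-median position adapted to the node capacity.

```python
from statistics import pvariance


def bounding_box(points):
    """Axis-aligned bounding box of a non-empty list of equal-length tuples."""
    dims = range(len(points[0]))
    lows = tuple(min(p[d] for p in points) for d in dims)
    highs = tuple(max(p[d] for p in points) for d in dims)
    return lows, highs


def vam_split(points, leaf_capacity=4):
    """Recursively partition points; return a list of (box, points) leaves.

    Simplifying assumption: the split index is the exact median, not the
    capacity-adjusted near-median used by the original VAMSplit method.
    """
    if len(points) <= leaf_capacity:
        return [(bounding_box(points), points)]
    dims = range(len(points[0]))
    # Choose the dimension along which the data spread (variance) is largest.
    split_dim = max(dims, key=lambda d: pvariance([p[d] for p in points]))
    ordered = sorted(points, key=lambda p: p[split_dim])
    mid = len(ordered) // 2
    return (vam_split(ordered[:mid], leaf_capacity)
            + vam_split(ordered[mid:], leaf_capacity))


# Toy 2-D dataset (illustrative values only, not from the paper).
sample = [(0, 0), (1, 10), (2, 20), (3, 30), (4, 0), (5, 10), (6, 20), (7, 30)]
leaves = vam_split(sample, leaf_capacity=2)
for box, members in leaves:
    print(box, members)
```

Because each split halves the point count, every leaf holds a similar number of points, so leaves in dense regions cover smaller boxes than leaves in sparse regions. Points that tie on the splitting coordinate can leave adjacent boxes sharing a boundary.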
Pages: 2514-2520
Page count: 7