AN EFFICIENT DISK BASED DATA STRUCTURE FOR RAPID SEARCHING OF QUANTITATIVE 2-DIMENSIONAL GEL DATABASES

Cited by: 3
Authors
LEMKIN, PF
WU, YC
UPTON, K
Affiliations
[1] CSPI,SCANALYT,BILLERICA,MA
[2] FCRDC,PROGRAM RESOURCES INC,FREDERICK,MD
DOI
10.1002/elps.11501401207
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Fast access to quantitative two-dimensional (2-D) gel databases is important for rapidly searching for protein differences between sets of 2-D gels from an experiment. The GELLAB-II system organizes corresponding spots from the gels in the database into reference or "Rspot" sets. The numeric Rspot names index fixed regions in the paged composite gel database file. This is adequate for an existing database, but has several problems. (i) Building the initial database requires guessing how much disk space to preallocate for each set of corresponding spots (i.e., spots from different gels). If the system runs out of preallocated space during this process, it must expand every set of corresponding spots, copying the old database data into the new layout in place on the disk. (ii) When gels are added or the database is edited, creating a new spot may also trigger this expansion. The time spent and the disk space wasted can be appreciable, depending on the size of the database (for a database on the order of 100 gels). (iii) Because every set of corresponding spots is the same size, space is wasted in most spot sets, which do not need the extra space required by the few spot sets that contain additional fragmented spots. We present a new low-level disk object-based structure and algorithm, paged indexed buckets (PIB), which optimizes disk space usage while retaining a retrieval speed similar to that of the original method.
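The abstract does not give the on-disk layout of PIB, so the following Python sketch is not the authors' algorithm. It is a small in-memory illustration, under stated assumptions, of the trade-off the abstract describes: a fixed-region layout in which every Rspot set is preallocated the same number of slots, compared with a hypothetical index that maps each Rspot number to a chain of small buckets allocated on demand. The class names, the doubling policy on expansion, and the bucket size are all assumptions made for illustration.

```python
from collections import defaultdict


class FixedRegionStore:
    """Assumed model of the fixed-region layout: every Rspot set
    is preallocated the same number of slots."""

    def __init__(self, n_rspots, slots_per_rspot):
        self.slots_per_rspot = slots_per_rspot
        self.regions = [[] for _ in range(n_rspots)]
        self.expansions = 0

    def add(self, rspot, spot):
        region = self.regions[rspot]
        if len(region) == self.slots_per_rspot:
            # One full region forces every region to grow (doubling is
            # an assumption); this stands in for the whole-file rewrite.
            self.slots_per_rspot *= 2
            self.expansions += 1
        region.append(spot)

    def allocated_slots(self):
        return len(self.regions) * self.slots_per_rspot


class BucketStore:
    """Hypothetical index-plus-buckets layout: the index maps an Rspot
    number to a chain of fixed-size buckets allocated only when needed."""

    def __init__(self, bucket_size):
        self.bucket_size = bucket_size
        self.buckets = []                # each bucket holds <= bucket_size spots
        self.index = defaultdict(list)   # rspot -> list of bucket numbers

    def add(self, rspot, spot):
        chain = self.index[rspot]
        if not chain or len(self.buckets[chain[-1]]) == self.bucket_size:
            # Only this Rspot set grows; other sets are untouched.
            self.buckets.append([])
            chain.append(len(self.buckets) - 1)
        self.buckets[chain[-1]].append(spot)

    def get(self, rspot):
        return [s for b in self.index.get(rspot, []) for s in self.buckets[b]]

    def allocated_slots(self):
        return len(self.buckets) * self.bucket_size


def compare_layouts():
    # Made-up workload: Rspot 0 has 10 (fragmented) spots, the rest have 4.
    spots_per_rspot = {0: 10, 1: 4, 2: 4, 3: 4, 4: 4}
    fixed = FixedRegionStore(n_rspots=5, slots_per_rspot=4)
    bucketed = BucketStore(bucket_size=4)
    for rspot, count in spots_per_rspot.items():
        for i in range(count):
            fixed.add(rspot, (rspot, i))
            bucketed.add(rspot, (rspot, i))
    return fixed.allocated_slots(), bucketed.allocated_slots(), fixed.expansions
```

In this toy workload only 26 spots are stored. The fixed-region model ends up reserving space for every Rspot set at the size needed by the largest one, while the bucket model grows only the set that needs it, which is the kind of saving the abstract attributes to PIB.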
Pages: 1341-1350
Page count: 10