Exploring performance issues for a clinical database organized using an entity-attribute-value representation

Cited by: 35
Authors
Chen, RS
Nadkarni, P
Marenco, L
Levin, F
Erdos, J
Miller, PL
Affiliations
[1] Yale Univ, Sch Med, Ctr Med Informat, New Haven, CT 06520 USA
[2] Evergreen Design, Guilford, CT USA
[3] Vet Affairs Med Ctr, West Haven, CT USA
DOI
10.1136/jamia.2000.0070475
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Background: The entity-attribute-value representation with classes and relationships (EAV/CR) provides a flexible and simple database schema to store heterogeneous biomedical data. In certain circumstances, however, the EAV/CR model is known to retrieve data less efficiently than conventionally based database schemas.
Objective: To perform a pilot study that systematically quantifies performance differences for database queries directed at real-world microbiology data modeled with EAV/CR and conventional representations, and to explore the relative merits of different EAV/CR query implementation strategies.
Methods: Clinical microbiology data obtained over a ten-year period were stored using both database models. Query execution times were compared for four clinically oriented attribute-centered and entity-centered queries operating under varying conditions of database size and system memory. The performance characteristics of three different EAV/CR query strategies were also examined.
Results: Performance was similar for entity-centered queries in the two database models. Performance in the EAV/CR model was approximately three to five times less efficient than its conventional counterpart for attribute-centered queries. The differences in query efficiency became slightly greater as database size increased, although they were reduced with the addition of system memory. The authors found that EAV/CR queries formulated using multiple, simple SQL statements executed in batch were more efficient than single, large SQL statements.
Conclusion: This paper describes a pilot project to explore issues in and compare query performance for EAV/CR and conventional database representations. Although attribute-centered queries were less efficient in the EAV/CR model, these inefficiencies may be addressable, at least in part, by the use of more powerful hardware or more memory, or both.
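Illustration: The query types named in the abstract can be sketched as follows. This is a minimal sketch and not the authors' schema or code. It assumes a plain EAV table without the classes and relationships of EAV/CR, uses SQLite rather than the database system of the study, and all table names, column names, function names, and sample rows are hypothetical. It shows an entity-centered query, an attribute-centered query in both representations, and the EAV attribute-centered query written once as a single joined statement and once as several simple statements. It does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Conventional representation: one column per attribute (hypothetical schema).
cur.execute(
    "CREATE TABLE culture (id INTEGER PRIMARY KEY, organism TEXT, specimen TEXT)"
)
# EAV representation: one row per (entity, attribute, value) triple.
cur.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")

# Hypothetical sample data, loaded into both representations.
rows = [(1, "E. coli", "urine"), (2, "S. aureus", "blood"), (3, "E. coli", "blood")]
cur.executemany("INSERT INTO culture VALUES (?, ?, ?)", rows)
for cid, organism, specimen in rows:
    cur.executemany(
        "INSERT INTO eav VALUES (?, ?, ?)",
        [(cid, "organism", organism), (cid, "specimen", specimen)],
    )


def eav_entity_centered(entity_id):
    """Entity-centered: fetch every attribute recorded for one entity."""
    cur.execute("SELECT attribute, value FROM eav WHERE entity_id = ?", (entity_id,))
    return dict(cur.fetchall())


def conventional_attribute_centered(organism, specimen):
    """Attribute-centered on the conventional table: a single WHERE clause."""
    cur.execute(
        "SELECT id FROM culture WHERE organism = ? AND specimen = ? ORDER BY id",
        (organism, specimen),
    )
    return [r[0] for r in cur.fetchall()]


def eav_single_statement(organism, specimen):
    """Attribute-centered on EAV as one statement: one self-join per criterion."""
    cur.execute(
        """
        SELECT a.entity_id
        FROM eav AS a
        JOIN eav AS b ON a.entity_id = b.entity_id
        WHERE a.attribute = 'organism' AND a.value = ?
          AND b.attribute = 'specimen' AND b.value = ?
        ORDER BY a.entity_id
        """,
        (organism, specimen),
    )
    return [r[0] for r in cur.fetchall()]


def eav_multiple_statements(organism, specimen):
    """Attribute-centered on EAV as several simple statements.

    Each criterion is applied by its own statement; intermediate hits are
    kept in a temporary table and narrowed step by step.
    """
    cur.execute("DROP TABLE IF EXISTS hits")
    cur.execute("CREATE TEMP TABLE hits (entity_id INTEGER)")
    cur.execute(
        "INSERT INTO hits SELECT entity_id FROM eav "
        "WHERE attribute = 'organism' AND value = ?",
        (organism,),
    )
    cur.execute(
        "DELETE FROM hits WHERE entity_id NOT IN "
        "(SELECT entity_id FROM eav WHERE attribute = 'specimen' AND value = ?)",
        (specimen,),
    )
    cur.execute("SELECT entity_id FROM hits ORDER BY entity_id")
    return [r[0] for r in cur.fetchall()]


print(eav_entity_centered(2))
print(conventional_attribute_centered("E. coli", "blood"))
print(eav_single_statement("E. coli", "blood"))
print(eav_multiple_statements("E. coli", "blood"))
```

In this sketch the entity-centered query touches only the rows of one entity in either representation, whereas the attribute-centered EAV query needs one pass over the triple table per criterion. That difference is one plausible reading of why the abstract reports a performance gap only for attribute-centered queries.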
Pages: 475-487
Number of pages: 13