Applying the archetype approach to the database of a biobank information management system

Cited by: 36
Authors
Spaeth, Melanie Bettina [1]
Grimson, Jane [1]
Affiliations
[1] Trinity Coll Dublin, Sch Comp Sci & Stat, Ctr Hlth Informat, Dublin 2, Ireland
Keywords
Biological Specimen Banks; Biobanks; Electronic Health Record; openEHR archetypes and templates; Biobank information management system; ELECTRONIC HEALTH RECORDS; CLINICAL-DATA; CARE RECORD; EXTRACTION; STANDARDS; CANCER
DOI
10.1016/j.ijmedinf.2010.11.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Purpose: The purpose of this study is to investigate the feasibility of applying the openEHR archetype approach to modelling the data in the database of an existing proprietary biobank information management system (BIMS). A BIMS stores the clinical/phenotypic data of the sample donor together with sample-related information. The clinical/phenotypic data is potentially sourced from the donor's electronic health record (EHR). The study evaluates the reuse, in the context of biobanking, of openEHR archetypes that were developed for the creation of an interoperable EHR, and proposes a new set of archetypes specifically for biobanks. The ultimate goal of the research is the development of an interoperable electronic biomedical research record (eBMRR) to support biomedical knowledge discovery.
Methods: The database of the prostate cancer biobank of the Irish Prostate Cancer Research Consortium (PCRC), which supports the identification of novel biomarkers for prostate cancer, was taken as the basis for the modelling effort. First, the database schema of the biobank was analyzed and reorganized into archetype-friendly concepts. Then, archetype repositories were searched for matching archetypes. Some existing archetypes were reused without change, some were modified or specialized, and new archetypes were developed where needed. The fields of the biobank database schema were then mapped to the elements in the archetypes. Finally, the archetypes were arranged into templates specifically to meet the requirements of the PCRC biobank.
Results: A set of 47 archetypes was found to cover all the concepts used in the biobank. Of these, 29 (62%) were reused without change, 6 were modified and/or extended, 1 was specialized, and 11 were newly defined. These archetypes were arranged into 8 templates specifically required for this biobank. A number of issues were encountered in this research. Some arose from the immaturity of the archetype approach, such as immature modelling support tools, difficulties in defining high-quality archetypes and the problem of overlapping archetypes. In addition, the identification of suitable existing archetypes was time-consuming, and many semantic conflicts were encountered during the process of mapping the PCRC BIMS database to existing archetypes. These include differences in the granularity of documentation, in metadata-level versus data-level modelling, in terminologies and vocabularies used, and in the amount of structure imposed on the information to be recorded. Furthermore, the current way of modelling the sample entity was found to be cumbersome in the sample-centric activity of biobanking.
Conclusions: The archetype approach is a promising way to create a shareable eBMRR based on the study participant/donor for biobanks. Many archetypes originally developed for the EHR domain can be reused to model the clinical/phenotypic and sample information in the biobank context, which validates the genericity of these archetypes and their potential for reuse in the context of biomedical research. However, finding suitable archetypes in the repositories and establishing an exact mapping between the fields in the PCRC BIMS database and the elements of existing archetypes that have been designed for clinical practice can be challenging and time-consuming, and involves resolving many common system integration conflicts. These may be attributable to differences in the requirements for information documentation between clinical practice and biobanking. This research also recognized the need for better support tools, modelling guidelines and best practice rules, and reconfirmed the need for better domain knowledge governance. Furthermore, the authors propose that the establishment of an independent sample record with the sample as record subject should be investigated. The research presented in this paper is limited by the fact that the new archetypes developed during this research are based on a single biobank instance. These new archetypes may not be complete, representing only those subsets of items required by this particular database. Nevertheless, this exercise exposes some of the gaps that exist in the archetype modelling landscape and highlights the concepts that need to be modelled with archetypes to enable the development of an eBMRR. © 2010 Elsevier Ireland Ltd. All rights reserved.
Pages: 205-226
Page count: 22