The Mass General Brigham Biobank Portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics

Cited by: 32
Authors
Castro, Victor M. [1]
Gainer, Vivian [1]
Wattanasin, Nich [1]
Benoit, Barbara [1]
Cagan, Andrew [1]
Ghosh, Bhaswati [1]
Goryachev, Sergey [1]
Metta, Reeta [1]
Park, Heekyong [1]
Wang, David [1]
Mendis, Michael [1]
Rees, Martin [1]
Herrick, Christopher [1]
Murphy, Shawn N. [1, 2, 3]
Affiliations
[1] Mass Gen Brigham, Res Informat Sci & Comp, 399 Revolut Dr, Somerville, MA 02145 USA
[2] Massachusetts Gen Hosp, Dept Neurol, Boston, MA 02114 USA
[3] Harvard Med Sch, Boston, MA 02115 USA
Keywords
Information storage and retrieval; data curation; data science; genomics; electronic health records; i2b2; HEALTH; INFORMATICS; CT
DOI
10.1093/jamia/ocab264
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Integrating and harmonizing disparate patient data sources into one consolidated data portal enables researchers to conduct analysis efficiently and effectively.
Materials and Methods: We describe an implementation of Informatics for Integrating Biology and the Bedside (i2b2) to create the Mass General Brigham (MGB) Biobank Portal data repository. The repository integrates data from primary and curated data sources and is updated weekly. The data are made readily available to investigators in a data portal where they can easily construct and export customized datasets for analysis.
Results: As of July 2021, there are 125 645 consented patients enrolled in the MGB Biobank. 88 527 (70.5%) have a biospecimen, 55 121 (43.9%) have completed the health information survey, 43 552 (34.7%) have genomic data and 124 760 (99.3%) have EHR data. Twenty machine learning computed phenotypes are calculated on a weekly basis. There are currently 1220 active investigators who have run 58 793 patient queries and exported 10 257 analysis files.
Discussion: The Biobank Portal allows noninformatics researchers to conduct study feasibility by querying across many data sources and then extract data that are most useful to them for clinical studies. While institutions require substantial informatics resources to establish and maintain integrated data repositories, they yield significant research value to a wide range of investigators.
Conclusion: The Biobank Portal and other patient data portals that integrate complex and simple datasets enable diverse research use cases. i2b2 tools to implement these registries and make the data interoperable are open source and freely available.
Pages: 643-651
Page count: 9