Big Data Health Care Platform With Multisource Heterogeneous Data Integration and Massive High-Dimensional Data Governance for Large Hospitals: Design, Development, and Application

Cited by: 25
Authors
Wang, Miye [1]
Li, Sheyu [2]
Zheng, Tao [1]
Li, Nan [1]
Shi, Qingke [1]
Zhuo, Xuejun [1]
Ding, Renxin [1]
Huang, Yong [1]
Affiliations
[1] Sichuan Univ, Engn Res Ctr Med Informat Technol, West China Hosp, Minist Educ, Chengdu, Sichuan, Peoples R China
[2] Sichuan Univ, West China Hosp, MAGIC China Ctr, Cochrane China Ctr,Dept Endocrinol & Metab, Chengdu, Peoples R China
Keywords
big data platform in health care; multisource; heterogeneous; data integration; data governance; data application; data security; data quality control; big data; data science; medical informatics; health care; data quality
DOI
10.2196/36481
Chinese Library Classification number
R-058
Abstract
Background: With the advent of data-intensive science, a full integration of big data science and health care will bring a cross-field revolution to the medical community in China. The concept of big data represents not only a technology but also a resource and a method. Big data are regarded as an important strategic resource at both the national level and the medical institutional level, and great importance has therefore been attached to the construction of big data platforms for health care.
Objective: We aimed to develop and implement a big data platform for a large hospital, to overcome difficulties in integrating, calculating, storing, and governing multisource heterogeneous data in a standardized way, and to ensure health care data security.
Methods: The project to build a big data platform at West China Hospital of Sichuan University was launched in 2017. The platform has extracted, integrated, and governed data generated by the hospital's departments and sections since January 2008. A master-slave mode was implemented to realize the real-time integration of massive multisource heterogeneous data, and an environment was built that separates storage from computation for data with heterogeneous characteristics. A business-based metadata model was improved for data quality control, and a standardized health care data governance system and a scientific closed-loop data security ecology were established.
Results: After 3 years of design, development, and testing, the West China Hospital of Sichuan University big data platform was formally brought online in November 2020. It has formed a massive multidimensional data resource database covering more than 12.49 million patients, 75.67 million visits, and 8475 data variables. As the hospital operates, newly generated data are entered into the platform in real time. Since its launch, the platform has supported more than 20 major projects and provided data services, storage, and computing power to many scientific teams, facilitating a shift in the data support model from conventional manual extraction to self-service retrieval (which has reached 8561 retrievals per month).
Conclusions: The platform can combine operational system data from all departments and sections of a hospital to form a massive, high-dimensional, high-quality health care database that allows electronic medical records to be used effectively and taps into the value of data to fully support clinical services, scientific research, and operations management. The platform successfully provides storage and computing power for multisource heterogeneous data. By effectively governing massive multidimensional data gathered from multiple sources, it provides highly available data assets and thus has high application value in the health care field. It also facilitates simpler and more efficient use of electronic medical record data for real-world research.
Pages: 196-210
Page count: 15