Machine Learning for Medical Data Integration

Cited by: 1
Authors
Mueller, Armin [1]
Christmann, Lara-Sophie [2]
Kohler, Severin [2]
Eils, Roland [2]
Prasser, Fabian [1]
Affiliations
[1] Charité Univ Med Berlin, Berlin Inst Hlth, Ctr Hlth Data Sci, Charité Pl 1, D-10117 Berlin, Germany
[2] Charité Univ Med Berlin, Digital Hlth Ctr, Berlin Inst Hlth, Charité Pl 1, D-10117 Berlin, Germany
Source
CARING IS SHARING-EXPLOITING THE VALUE IN DATA FOR HEALTH AND INNOVATION-PROCEEDINGS OF MIE 2023 | 2023 / Vol. 302
Keywords
medical data integration; common data models; machine learning
DOI
10.3233/SHTI230241
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Making health data available for secondary use enables innovative data-driven medical research. Since modern machine learning (ML) methods and precision medicine require extensive amounts of data covering most of the standard and edge cases, it is essential to initially acquire large datasets. This can typically only be achieved by integrating different datasets from various sources and sharing data across sites. To obtain a unified dataset from heterogeneous sources, standard representations and Common Data Models (CDMs) are needed. The process of mapping data into these standardized representations is usually very tedious and requires many manual configuration and refinement steps. A potential way to reduce these efforts is to use ML methods not only for data analysis, but also for the integration of health data on the syntactic, structural, and semantic levels. However, research on ML-based medical data integration is still in its infancy. In this article, we describe the current state of the literature and present selected methods that appear to have a particularly high potential to improve medical data integration. Moreover, we discuss open issues and possible future research directions.
Pages: 691-695
Number of pages: 5