Evaluation of synthetic electronic health records: A systematic review and experimental assessment

Cited by: 1
Authors
Budu, Emmanuella [1]
Etminani, Kobra [1]
Soliman, Amira [1]
Rognvaldsson, Thorsteinn [1]
Affiliations
[1] Halmstad Univ, Ctr Appl Intelligent Syst Res CAISR, Kristian IV s vag 3, S-30118 Halmstad, Sweden
Keywords
Synthetic data; Electronic health records (EHRs); Evaluation; GENERATION; FRAMEWORK; PRIVACY;
DOI
10.1016/j.neucom.2024.128253
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent studies have shown how synthetic data generation methods can be applied to electronic health records (EHRs) to obtain synthetic versions that do not violate privacy rules. This growing body of research has produced numerous methods for evaluating the quality of generated data, with new publications often introducing novel evaluation methods. This work presents a detailed review of synthetic EHRs, focusing on the evaluation methods used to assess the quality of the generated EHRs. We discuss the existing evaluation methods, offering insights into their use and interpreting the evaluation metrics from the perspectives of fidelity, utility, and privacy. Furthermore, we highlight the key factors influencing the selection of evaluation methods, such as the type of data (e.g., categorical, continuous, or discrete) and the mode of application (e.g., patient level, cohort level, and feature level). To assess the effectiveness of current evaluation measures, we conduct a series of experiments that shed light on their potential limitations. The findings reveal notable shortcomings, including the need for meticulous application of methods to the data to reduce inconsistent evaluations, the qualitative nature of some assessments, which are subject to individual judgment, the need for clinical validation, and the absence of techniques to evaluate temporal dependencies within the data. This highlights the need to place greater emphasis on evaluation measures, their application, and the development of comprehensive evaluation frameworks, all of which are crucial for advancing progress in this field.
Pages: 21