Machine Learning for Automatic Encoding of French Electronic Medical Records: Is More Data Better?

Cited by: 1
Authors
Gobeill, Julien [1, 2]
Ruch, Patrick [1, 2]
Meyer, Rodolphe [3]
Affiliations
[1] Swiss Inst Bioinformat, SIB Text Min Grp, Geneva, Switzerland
[2] HES So HEG, Informat Sci, Geneva, Switzerland
[3] Univ Hospitals Geneva HUG, Informat Syst Dept, Geneva, Switzerland
Source
DIGITAL PERSONALIZED HEALTH AND MEDICINE | 2020 / Vol. 270
Keywords
Medical coding; machine learning; text mining
DOI
10.3233/SHTI200173
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Abstract
Encoding Electronic Medical Records is a complex and time-consuming task. We report on a machine learning model that proposes diagnosis and procedure codes, built from a large, realistic dataset of 245 000 electronic medical records from the University Hospitals of Geneva. Our study focuses in particular on the impact of training data quantity on the model's performance. We show that performance does not increase when encoded instances from previous years are added to the training data. Furthermore, supervised models are shown to be highly perishable: we observe a potential drop in performance of around 10% per year. Consequently, great and constant care must be exercised in designing and updating the content of the knowledge bases exploited by machine learning.
Pages: 312 - 316
Page count: 5
Related papers
50 in total
  • [31] Machine Learning Based Text Mining in Electronic Health Records: Cardiovascular Patient Cases
    Sikorskiy, Sergey
    Metsker, Oleg
    Yakovlev, Alexey
    Kovalchuk, Sergey
    COMPUTATIONAL SCIENCE - ICCS 2018, PT III, 2018, 10862 : 818 - 824
  • [32] Approach to machine learning for extraction of real-world data variables from electronic health records
    Adamson, Blythe
    Waskom, Michael
    Blarre, Auriane
    Kelly, Jonathan
    Krismer, Konstantin
    Nemeth, Sheila
    Gippetti, James
    Ritten, John
    Harrison, Katherine
    Ho, George
    Linzmayer, Robin
    Bansal, Tarun
    Wilkinson, Samuel
    Amster, Guy
    Estola, Evan
    Benedum, Corey M.
    Fidyk, Erin
    Estevez, Melissa
    Shapiro, Will
    Cohen, Aaron B.
    FRONTIERS IN PHARMACOLOGY, 2023, 14
  • [33] Finding undiagnosed patients with hepatitis C virus: an application of machine learning to US ambulatory electronic medical records
    Rigg, John
    Doyle, Orla
    McDonogh, Niamh
    Leavitt, Nadea
    Ali, Rehan
    Son, Annie
    Kreter, Bruce
    BMJ HEALTH & CARE INFORMATICS, 2023, 30 (01)
  • [34] The incremental design of a machine learning framework for medical records processing
    Streiffer, Christopher
    Saini, Divya
    Whitehead, Gideon
    Daniel, Jency
    Garzon-Mrad, Carolina
    Kavanaugh, Laura
    Anyanwu, Emeka
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (10) : 2236 - 2245
  • [35] Identifying individuals at risk for weight gain using machine learning in electronic medical records from the United States
    Choong, Casey
    Xavier, Neena
    Falcon, Beverly
    Kan, Hong
    Lipkovich, Ilya
    Nowak, Callie
    Hoyt, Margaret
    Houle, Christy
    Kahan, Scott
    DIABETES OBESITY & METABOLISM, 2025
  • [36] Generalizability of machine learning methods in detecting adverse drug events from clinical narratives in electronic medical records
    Zitu, Md Muntasir
    Zhang, Shijun
    Owen, Dwight H. H.
    Chiang, Chienwei
    Li, Lang
    FRONTIERS IN PHARMACOLOGY, 2023, 14
  • [37] EXTRACTION OF MEDICAL DATA FROM ELECTRONIC MEDICAL RECORDS USING NLP ALGORITHMS
    Gusev, Aleksandr V.
    Novitskiy, Roman E.
    Ivshin, Aleksandr A.
    Boldina, Juliia S.
    Shtykov, Aleksey S.
    Vasilev, Aleksey S.
    AD ALTA-JOURNAL OF INTERDISCIPLINARY RESEARCH, 2022, 12 (02) : 314 - 319
  • [38] Are more data always better? - Machine learning forecasting of algae based on long-term observations
    Beckmann, D. Atton
    Werther, M.
    Mackay, E. B.
    Spyrakos, E.
    Hunter, P.
    Jones, I. D.
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2025, 373
  • [39] Automatic Electronic Invoice Classification Using Machine Learning Models
    Bardelli, Chiara
    Rondinelli, Alessandro
    Vecchio, Ruggero
    Figini, Silvia
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2020, 2 (04) : 617 - 629
  • [40] Using Electronic Health Records and Machine Learning to Predict Postpartum Depression
    Wang, Shuojia
    Pathak, Jyotishman
    Zhang, Yiye
    MEDINFO 2019: HEALTH AND WELLBEING E-NETWORKS FOR ALL, 2019, 264 : 888 - 892