Importance of multi-modal approaches to effectively identify cataract cases from electronic health records

Cited by: 93
Authors
Peissig, Peggy L. [1 ]
Rasmussen, Luke V. [1 ,2 ]
Berg, Richard L. [1 ]
Linneman, James G. [1 ]
McCarty, Catherine A. [3 ,4 ]
Waudby, Carol [3 ]
Chen, Lin [5 ]
Denny, Joshua C. [6 ,7 ]
Wilke, Russell A.
Pathak, Jyotishman [8 ]
Carrell, David [9 ]
Kho, Abel N. [10 ]
Starren, Justin B. [2 ]
Affiliations
[1] Marshfield Clin Res Fdn, Biomed Informat Res Ctr, Marshfield, WI 54449 USA
[2] Northwestern Univ, Dept Prevent Med, Feinberg Sch Med, Div Hlth & Biomed Informat, Chicago, IL 60611 USA
[3] Marshfield Clin Res Fdn, Ctr Human Genet, Marshfield, WI 54449 USA
[4] Essentia Inst Rural Hlth, Duluth, MN USA
[5] Marshfield Clin Fdn Med Res & Educ, Dept Ophthalmol, Marshfield, WI USA
[6] Vanderbilt Univ, Sch Med, Dept Biomed Informat, Nashville, TN 37212 USA
[7] Vanderbilt Univ, Sch Med, Dept Med, Nashville, TN 37212 USA
[8] Mayo Clin, Dept Hlth Sci Res, Rochester, MN USA
[9] Grp Hlth Res Inst, Seattle, WA USA
[10] Northwestern Univ, Dept Med, Feinberg Sch Med, Chicago, IL 60611 USA
Keywords
GENOME-WIDE ASSOCIATION; MEDICAL-RECORDS; VISUAL IMPAIRMENT; POPULATION; PREVALENCE; ADULTS;
DOI
10.1136/amiajnl-2011-000456
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Objective: There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts.
Materials and methods: We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions.
Results: An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy.
Discussion: A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents.
Conclusion: We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries.
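The abstract reports positive predictive value (PPV) for cases found by combining structured queries, natural language processing, and optical character recognition, measured against manual chart review. The sketch below illustrates that idea only. It assumes a subject is flagged when any one of the three modes finds evidence, which is a simplification and not the published algorithm; the `Evidence` class, the function names, and the five toy subjects are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical per-subject flags, one per data mode."""
    structured: bool  # found by a structured database query
    nlp: bool         # found by NLP on free-text documents
    ocr: bool         # found by OCR on scanned clinical images


def is_case_multimodal(e: Evidence) -> bool:
    # Assumption: a simple union (logical OR) of the three modes.
    return e.structured or e.nlp or e.ocr


def ppv(predicted: dict[str, bool], chart_review: dict[str, bool]) -> float:
    """PPV = true positives / all subjects flagged by the algorithm."""
    flagged = [s for s, p in predicted.items() if p and s in chart_review]
    if not flagged:
        raise ValueError("no flagged subjects with chart review available")
    true_positives = sum(1 for s in flagged if chart_review[s])
    return true_positives / len(flagged)


# Toy data, invented purely for illustration.
evidence = {
    "s1": Evidence(structured=True, nlp=False, ocr=False),
    "s2": Evidence(structured=False, nlp=True, ocr=False),
    "s3": Evidence(structured=False, nlp=False, ocr=True),
    "s4": Evidence(structured=False, nlp=False, ocr=False),
    "s5": Evidence(structured=True, nlp=True, ocr=False),
}
chart_review = {"s1": True, "s2": True, "s3": False, "s4": False, "s5": True}

multimodal = {s: is_case_multimodal(e) for s, e in evidence.items()}
structured_only = {s: e.structured for s, e in evidence.items()}

multimodal_count = sum(multimodal.values())
structured_count = sum(structured_only.values())
multimodal_ppv = ppv(multimodal, chart_review)
structured_ppv = ppv(structured_only, chart_review)

print(f"multi-modal: {multimodal_count} flagged, PPV {multimodal_ppv:.2f}")
print(f"structured only: {structured_count} flagged, PPV {structured_ppv:.2f}")
```

In this toy data the union flags more subjects than the structured query alone, at the cost of one false positive. The abstract describes the same trade-off being evaluated, with the reported result that PPV stayed high.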
Pages: 225-234
Page count: 10