Bias correction models for electronic health records data in the presence of non-random sampling

Authors
Kim, Jiyu [1]
Anthopolos, Rebecca [1]
Zhong, Judy [1]
Affiliations
[1] NYU, NYU Grossman Sch Med, Dept Populat Hlth, 180 Madison Ave, New York, NY 10016 USA
Funding
National Institutes of Health (NIH), USA
Keywords
bias correction; EHRs; SNAR; social determinants of health; selection models; population; inference
DOI
10.1093/biomtc/ujae014
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Electronic health records (EHRs) contain rich clinical information for millions of patients and are increasingly used for public health research. However, non-random inclusion of subjects in EHRs can result in selection bias, with factors such as demographics, socioeconomic status, healthcare referral patterns, and underlying health status playing a role. While this issue has been well documented, little work has been done to develop or apply bias-correction methods, often because most of these factors are unavailable in EHRs. To address this gap, we propose a series of Heckman-type bias correction methods by incorporating social determinants of health selection covariates to model the EHR non-random sampling probability. Through simulations under various settings, we demonstrate the effectiveness of our proposed method in correcting biases in both the association coefficient and the outcome mean. Our method augments the utility of EHRs for public health inferences, as we show by estimating the prevalence of cardiovascular disease and its correlation with risk factors in the New York City network of EHRs.
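For readers unfamiliar with Heckman-type corrections, the following is a minimal sketch of the classic textbook two-step estimator on simulated data. It is not the authors' implementation. Everything in it is an assumption made for illustration: the data-generating model, the coefficient values, the single selection covariate `z` (standing in for a social-determinants-of-health variable), and the requirement that `x` and `z` are observed for the whole population, which may not match the setting of the paper. Step 1 fits a probit model for inclusion in the sample; step 2 adds the inverse Mills ratio as a regressor in the outcome model fitted to the selected subjects only.

```python
import random
from statistics import NormalDist, fmean

PHI = NormalDist()  # standard normal distribution (pdf and cdf)

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Least-squares coefficients via the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def probit(W, s, iters=15):
    """Probit coefficients by Fisher scoring, starting from zero."""
    k = len(W[0])
    g = [0.0] * k
    for _ in range(iters):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for w, si in zip(W, s):
            eta = sum(wi * gi for wi, gi in zip(w, g))
            p = min(max(PHI.cdf(eta), 1e-10), 1 - 1e-10)  # guard against 0 and 1
            d = PHI.pdf(eta)
            a = d * (si - p) / (p * (1 - p))
            h = d * d / (p * (1 - p))
            for i in range(k):
                score[i] += a * w[i]
                for j in range(k):
                    info[i][j] += h * w[i] * w[j]
        g = [gi + di for gi, di in zip(g, solve(info, score))]
    return g

# Simulated population (all values are illustrative assumptions).
# Outcome: y = 1 + 2x + e.  Selection: included if x + z + u > 0.
# corr(u, e) = RHO makes the selection non-ignorable.
rng = random.Random(0)
N, RHO = 8000, 0.8
rows = []
for _ in range(N):
    x, z, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    e = RHO * u + (1 - RHO**2) ** 0.5 * rng.gauss(0, 1)
    rows.append((x, z, 1.0 + 2.0 * x + e, 1 if x + z + u > 0 else 0))

# Step 1: probit model of inclusion, fitted on the whole population.
gamma = probit([[1.0, r[0], r[1]] for r in rows], [r[3] for r in rows])

def mills(x, z):
    """Inverse Mills ratio at the fitted selection index."""
    idx = gamma[0] + gamma[1] * x + gamma[2] * z
    return PHI.pdf(idx) / max(PHI.cdf(idx), 1e-12)

# Step 2: outcome regression on the selected sample, with and without correction.
sel = [r for r in rows if r[3] == 1]
y_sel = [r[2] for r in sel]
naive_beta = ols([[1.0, r[0]] for r in sel], y_sel)
heckman_beta = ols([[1.0, r[0], mills(r[0], r[1])] for r in sel], y_sel)

# Outcome mean: raw sample mean vs. corrected model averaged over the population.
naive_mean = fmean(y_sel)
heckman_mean = heckman_beta[0] + heckman_beta[1] * fmean([r[0] for r in rows])

print("slope: naive %.3f, corrected %.3f (true 2.0)" % (naive_beta[1], heckman_beta[1]))
print("mean:  naive %.3f, corrected %.3f (true 1.0)" % (naive_mean, heckman_mean))
```

In this toy setup, `z` affects selection but not the outcome, which gives the correction term variation that is not explained by `x`. The standard errors from the second-step least-squares fit would need adjustment in practice, which the sketch does not attempt.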
Pages: 14