Bias correction models for electronic health records data in the presence of non-random sampling

Cited by: 0
Authors
Kim, Jiyu [1]
Anthopolos, Rebecca [1]
Zhong, Judy [1]
Affiliations
[1] NYU, NYU Grossman Sch Med, Dept Populat Hlth, 180 Madison Ave, New York, NY 10016 USA
Funding
US National Institutes of Health (NIH)
Keywords
bias correction; EHRs; SNAR; social determinants of health; selection models; population; inference
DOI
10.1093/biomtc/ujae014
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Electronic health records (EHRs) contain rich clinical information for millions of patients and are increasingly used for public health research. However, non-random inclusion of subjects in EHRs can result in selection bias, driven by factors such as demographics, socioeconomic status, healthcare referral patterns, and underlying health status. Although this issue has been well documented, little work has been done to develop or apply bias-correction methods, often because most of these factors are unavailable in EHRs. To address this gap, we propose a series of Heckman-type bias-correction methods that incorporate social determinants of health as selection covariates to model the EHR non-random sampling probability. Through simulations under various settings, we demonstrate the effectiveness of the proposed methods in correcting biases in both the association coefficient and the outcome mean. The methods augment the utility of EHRs for public health inference, as we show by estimating the prevalence of cardiovascular disease and its correlation with risk factors in the New York City network of EHRs.
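The abstract names Heckman-type selection models but gives no formulas, so the following is only a minimal sketch of the classical Heckman two-step idea on simulated data, not the authors' estimator. Everything in it is an assumption made for illustration: the single selection covariate `z` (a stand-in for a social-determinant variable), the true outcome mean `mu`, the error correlation `rho`, and the helper names `fit_probit`, `inverse_mills`, and `run_demo`. Step 1 fits a probit model for inclusion in the sample; step 2 regresses the outcome on the inverse Mills ratio among included subjects, and the intercept serves as the corrected outcome mean.

```python
import math
import random


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inverse_mills(eta):
    # lambda(eta) = phi(eta) / Phi(eta)
    return norm_pdf(eta) / norm_cdf(eta)


def fit_probit(z, s, iters=20):
    """Step 1: fit P(s=1 | z) = Phi(g0 + g1*z) by Fisher scoring."""
    g0, g1 = 0.0, 0.0
    for _ in range(iters):
        u0 = u1 = i00 = i01 = i11 = 0.0  # score vector and information matrix
        for zi, si in zip(z, s):
            eta = g0 + g1 * zi
            p = min(max(norm_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = norm_pdf(eta)
            r = (si - p) * d / (p * (1.0 - p))
            w = d * d / (p * (1.0 - p))
            u0 += r
            u1 += r * zi
            i00 += w
            i01 += w * zi
            i11 += w * zi * zi
        det = i00 * i11 - i01 * i01
        g0 += (i11 * u0 - i01 * u1) / det
        g1 += (i00 * u1 - i01 * u0) / det
    return g0, g1


def ols_intercept_slope(x, y):
    """Step 2 helper: simple least squares of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def run_demo(n=20000, mu=2.0, rho=0.7, seed=0):
    rng = random.Random(seed)
    z, s, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)  # selection error
        # outcome error, correlated with the selection error
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        z.append(zi)
        s.append(1 if zi + u > 0 else 0)  # non-random inclusion in the sample
        y.append(mu + e)
    g0, g1 = fit_probit(z, s)
    y_sel = [yi for yi, si in zip(y, s) if si == 1]
    lam_sel = [inverse_mills(g0 + g1 * zi) for zi, si in zip(z, s) if si == 1]
    naive = sum(y_sel) / len(y_sel)  # mean over included subjects only
    corrected, _ = ols_intercept_slope(lam_sel, y_sel)
    return naive, corrected


naive_mean, corrected_mean = run_demo()
print("naive mean among included subjects:", round(naive_mean, 3))
print("Heckman two-step corrected mean:   ", round(corrected_mean, 3))
```

In this toy setup the naive mean overshoots the assumed true mean of 2.0 because subjects with larger outcome errors are more likely to be included, while the two-step intercept should land close to 2.0. The correction relies on `z` affecting inclusion but not the outcome, and on jointly normal errors; both are assumptions of the sketch.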
Pages: 14