Analysis of missing data in electronic health records of people with diabetes in primary care in Spain: A population-based cohort study

Times cited: 0
Authors
Quesada, Jose Antonio [1 ]
Orozco-Beltran, Domingo [1 ,2 ]
Affiliations
[1] Miguel Hernandez Univ, Dept Clin Med, Crta Nacl 332 S-N, Alacant 03550, Spain
[2] Cabo Huertas Hlth Ctr, Dept Universal Hlth & Publ Hlth, Alicante 03540, Spain
Keywords
Electronic health records; Missing values; Diabetes
DOI
10.1016/j.ijmedinf.2024.105722
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Introduction: Researchers conducting studies based on electronic health records (EHRs) often have to deal with missing data. We aimed to analyze patterns of missing data in the lipid profile, sociodemographic variables, and risk factors contained in the EHRs of the CARDIABETES project, and to compare different strategies for addressing the issue.
Methods: We conducted a retrospective cohort study of people with diabetes, based on EHRs in the Spanish Pharmacoepidemiological Research Database for Public Health Systems (BIFAP). Our response variable was major adverse cardiovascular events (MACE), including all-cause death and hospital admission for cerebrovascular disease or ischemic heart disease. We analyzed patterns of missing data, associations between missingness and MACE, and the effect of eliminating cases with missing data or imputing missing data.
Results: Our total sample included 309,556 people with diabetes. The proportion of individuals with at least one missing value was 76.0%. Regarding diabetes control measures, 10.8% of records had missing glycated hemoglobin values, and 21.4% had missing fasting blood glucose values. We observed a non-random pattern of association between missingness and MACE. The strategy of eliminating records with missing data greatly reduced the number of cases and statistical power, and altered the average participant characteristics and cumulative incidence of MACE. By imputing missing data, we were able to circumvent these problems.
Conclusion: A considerable proportion of missing data was observed for variables such as fasting blood glucose and glycated hemoglobin, and also for other variables such as blood test parameters, BMI, and tobacco and alcohol use. The missing data show a non-random pattern and are associated with a higher incidence of MACE. The strategy of eliminating records with missing data greatly reduced the number of cases and statistical power.
The recommended solution is to impute missing data with methods that take all the variables into account, such as multiple imputation by chained equations (MICE) with predictive mean matching (PMM).
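The effect of complete-case deletion described in the abstract can be illustrated with a minimal sketch. The records, field names (`hba1c`, `glucose`, `bmi`, `mace`), and helper functions below are hypothetical and are not taken from the study or from BIFAP; the sketch only shows how the share of records with at least one missing value is computed and how dropping incomplete records can shift the observed MACE incidence when missingness is not random. It does not implement imputation.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical record type: None marks a missing value.
Record = Dict[str, Optional[float]]


def has_missing(record: Record, fields: Sequence[str]) -> bool:
    """Return True if any of the given fields is missing in the record."""
    return any(record.get(field) is None for field in fields)


def share_with_any_missing(records: List[Record], fields: Sequence[str]) -> float:
    """Proportion of records with at least one missing value."""
    return sum(has_missing(r, fields) for r in records) / len(records)


def complete_cases(records: List[Record], fields: Sequence[str]) -> List[Record]:
    """Complete-case (listwise) deletion: keep only fully observed records."""
    return [r for r in records if not has_missing(r, fields)]


def incidence(records: List[Record], outcome: str = "mace") -> float:
    """Cumulative incidence of a 0/1 outcome among the given records."""
    return sum(r[outcome] for r in records) / len(records)


# Toy data (invented for illustration): both MACE cases have a missing value.
FIELDS = ["hba1c", "glucose", "bmi"]
TOY_RECORDS: List[Record] = [
    {"hba1c": 7.1, "glucose": 130.0, "bmi": 29.0, "mace": 0},
    {"hba1c": None, "glucose": 150.0, "bmi": None, "mace": 1},
    {"hba1c": 6.5, "glucose": None, "bmi": 31.0, "mace": 1},
    {"hba1c": 8.0, "glucose": 160.0, "bmi": 27.5, "mace": 0},
]

if __name__ == "__main__":
    kept = complete_cases(TOY_RECORDS, FIELDS)
    print("share with any missing:", share_with_any_missing(TOY_RECORDS, FIELDS))
    print("records kept after deletion:", len(kept), "of", len(TOY_RECORDS))
    print("MACE incidence, all records:", incidence(TOY_RECORDS))
    print("MACE incidence, complete cases:", incidence(kept))
```

In this toy example, deletion halves the sample and removes every MACE case, which is an exaggerated version of the bias the abstract attributes to eliminating records with missing data.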
Pages: 6
Related papers
50 in total
  • [1] Polypharmacy in primary care: A population-based retrospective cohort study of electronic health records
    Woodcock, Thomas
    Lovett, Derryn
    Ihenetu, Gloria
    Novov, Vesselin
    Beaney, Thomas
    Armani, Keivan
    Quilley, Angela
    Majeed, Azeem
    Aylin, Paul
    PLOS ONE, 2024, 19 (09):
  • [2] Validity of using UK primary care electronic health records to study migration and health: a population-based cohort study
    Pathak, Neha
    Burns, Rachel
    Gonzalez-Izquierdo, Arturo
    Denaxas, Spiros
    Sonnenberg, Pam
    Hayward, Andrew
    Aldridge, Robert
    LANCET, 2019, 394 : 75 - 75
  • [3] Sepsis recording in primary care electronic health records, linked hospital episodes and mortality records: Population-based cohort study in England
    Rezel-Potts, Emma
    Gulliford, Martin C.
    PLOS ONE, 2020, 15 (12):
  • [4] COMPARISON OF SEPSIS RECORDING IN PRIMARY CARE ELECTRONIC HEALTH RECORDS AND LINKED HOSPITAL EPISODES AND MORTALITY DATA: POPULATION-BASED COHORT STUDY IN ENGLAND
    Rezel-Potts, E.
    Gulliford, M.
    JOURNAL OF EPIDEMIOLOGY AND COMMUNITY HEALTH, 2020, 74 : A22 - A22
  • [5] Contribution of infection to mortality in people with type 2 diabetes: a population-based cohort study using electronic records
    Carey, Iain M.
    Critchley, Julia A.
    Chaudhry, Umar A. R.
    DeWilde, Stephen
    Limb, Elizabeth S.
    Bowen, Liza
    Audi, Selma
    Cook, Derek G.
    Whincup, Peter H.
    Sattar, Naveed
    Panahloo, Arshia
    Harris, Tess
    LANCET REGIONAL HEALTH-EUROPE, 2025, 48
  • [6] Validity of UK electronic health records to study migrant health: a population-based cohort study
    Pathak, N.
    Patel, P.
    Mathur, R.
    Burns, R.
    Gonzalez-Izquierdo, A.
    Denaxas, S.
    Sonnenberg, P.
    Hayward, A.
    Aldridge, R.
    EUROPEAN JOURNAL OF PUBLIC HEALTH, 2020, 30
  • [7] Outcomes of COVID-19 Infection in People Previously Vaccinated Against Influenza: Population-Based Cohort Study Using Primary Health Care Electronic Records
    Giner-Soriano, Maria
    de Dios, Vanessa
    Ouchi, Dan
    Vilaplana-Carnerero, Carles
    Monteagudo, Monica
    Morros, Rosa
    JMIR PUBLIC HEALTH AND SURVEILLANCE, 2022, 8 (11):
  • [8] Childhood obesity trends from primary care electronic health records in England between 1994 and 2013: population-based cohort study
    van Jaarsveld, Cornelia H. M.
    Gulliford, Martin C.
    ARCHIVES OF DISEASE IN CHILDHOOD, 2015, 100 (03) : 214 - 219
  • [9] Primary Care Physician Volume and Quality of Diabetes Care: A Population-Based Cohort Study
    Cheung, Andrew
    Stukel, Therese A.
    Alter, David A.
    Glazier, Richard H.
    Ling, Vicki
    Wang, Xuesong
    Shah, Baiju R.
    ANNALS OF INTERNAL MEDICINE, 2017, 166 (04) : 240 - +
  • [10] Identification of Early Onset Dementia in Population-Based Health Administrative Data: A Validation Study Using Primary Care Electronic Medical Records
    Jaakkimainen, Liisa
    Duchen, Raquel
    Lix, Lisa
    Al-Azazi, Saeed
    Yu, Bing
    Butt, Debra
    Park, Su-Bin
    Widdifield, Jessica
    JOURNAL OF ALZHEIMERS DISEASE, 2022, 89 (04) : 1463 - 1472