Identifying potential biases in code sequences in primary care electronic healthcare records: a retrospective cohort study of the determinants of code frequency

Cited by: 5
Authors
Beaney, Thomas [1, 2]
Clarke, Jonathan [2]
Salman, David [1, 3]
Woodcock, Thomas [1]
Majeed, Azeem [1]
Barahona, Mauricio [2]
Aylin, Paul [1]
Affiliations
[1] Imperial Coll London, Dept Primary Care & Publ Hlth, London, England
[2] Imperial Coll London, Dept Math, London, England
[3] Imperial Coll London, MSk Lab, London, England
Source
BMJ OPEN | 2023 / Vol. 13 / Issue 09
Funding
Wellcome Trust (UK); Engineering and Physical Sciences Research Council (UK);
Keywords
epidemiology; primary health care; health informatics; statistics & research methods; ENGLAND;
DOI
10.1136/bmjopen-2023-072884
Chinese Library Classification number
R5 [Internal medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Objectives: To determine whether the frequency of diagnostic codes for long-term conditions (LTCs) in primary care electronic healthcare records (EHRs) is associated with (1) disease coding incentives, (2) General Practice (GP), (3) patient sociodemographic characteristics and (4) calendar year of diagnosis.
Design: Retrospective cohort study.
Setting: GPs in England from 2015 to 2022 contributing to the Clinical Practice Research Datalink Aurum dataset.
Participants: All patients registered to a GP with at least one incident LTC diagnosed between 1 January 2015 and 31 December 2019.
Primary and secondary outcome measures: The number of diagnostic codes for an LTC in (1) the first and (2) the second year following diagnosis, stratified by inclusion in the Quality and Outcomes Framework (QOF) financial incentive programme.
Results: 3 113 724 patients were included, with 7 723 365 incident LTCs. Conditions included in QOF had higher rates of annual coding than conditions not included in QOF (1.03 vs 0.32 per year, p<0.0001). There was significant variation in code frequency by GP which was not explained by patient sociodemographics. We found significant associations with patient sociodemographics, with a trend towards higher coding rates in people living in areas of higher deprivation for both QOF and non-QOF conditions. Code frequency was lower for conditions with follow-up time in 2020, associated with the onset of the COVID-19 pandemic.
Conclusions: The frequency of diagnostic codes for newly diagnosed LTCs is influenced by factors including patient sociodemographics, disease inclusion in QOF, GP practice and the impact of the COVID-19 pandemic. Natural language processing or other methods using temporally ordered code sequences should account for these factors to minimise potential bias.
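The outcome described in the abstract (codes recorded in the first and second year after an incident diagnosis, stratified by QOF inclusion) can be illustrated with a small sketch. This is not the authors' method: the record layout, the function names, the toy data and the window boundaries (days 1 to 365 as year 1, days 366 to 730 as year 2, with the diagnosis-day code excluded) are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy data, not taken from the study.
# (patient_id, condition) -> (incident diagnosis date, condition is in QOF?)
diagnoses = {
    ("p1", "diabetes"): (date(2016, 3, 1), True),
    ("p2", "diabetes"): (date(2017, 6, 15), True),
    ("p1", "eczema"): (date(2016, 3, 1), False),
}

# Later diagnostic codes: (patient_id, condition, date the code was recorded)
code_events = [
    ("p1", "diabetes", date(2016, 3, 1)),   # diagnosis day itself, excluded
    ("p1", "diabetes", date(2016, 9, 1)),   # falls in year 1
    ("p1", "diabetes", date(2017, 6, 1)),   # falls in year 2
    ("p2", "diabetes", date(2017, 12, 1)),  # falls in year 1
    ("p1", "eczema", date(2016, 4, 1)),     # falls in year 1
    ("p1", "eczema", date(2018, 5, 1)),     # after year 2, ignored
]


def count_codes_by_year(diagnoses, code_events):
    """Return {(patient, condition): [codes in year 1, codes in year 2]}."""
    counts = {key: [0, 0] for key in diagnoses}
    for patient, condition, code_date in code_events:
        key = (patient, condition)
        if key not in diagnoses:
            continue
        days = (code_date - diagnoses[key][0]).days
        if 1 <= days <= 365:        # assumed year-1 window
            counts[key][0] += 1
        elif 366 <= days <= 730:    # assumed year-2 window
            counts[key][1] += 1
    return counts


def mean_rate_by_qof(diagnoses, counts):
    """Return {in_qof: (mean year-1 codes, mean year-2 codes)}."""
    totals = defaultdict(lambda: [0, 0, 0])  # year-1 sum, year-2 sum, n
    for key, (_, in_qof) in diagnoses.items():
        totals[in_qof][0] += counts[key][0]
        totals[in_qof][1] += counts[key][1]
        totals[in_qof][2] += 1
    return {q: (t[0] / t[2], t[1] / t[2]) for q, t in totals.items()}


counts = count_codes_by_year(diagnoses, code_events)
rates = mean_rate_by_qof(diagnoses, counts)
for in_qof, (year1, year2) in sorted(rates.items()):
    label = "QOF" if in_qof else "non-QOF"
    print(f"{label}: year 1 = {year1:.2f}, year 2 = {year2:.2f} codes per condition")
```

A plain mean per stratum is used here only to keep the sketch short; it does not adjust for practice, sociodemographics or calendar year, which the abstract identifies as determinants of code frequency.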
Pages: 11