Toward a Computable Phenotype for Determining Eligibility of Lung Cancer Screening Using Electronic Health Records

Cited by: 0
Authors
Yang, Shuang [1]
Huang, Yu [1]
Lou, Xiwei [1]
Lyu, Tianchen [1, 2]
Wei, Ruoqi [1]
Mehta, Hiren J. [3]
Wu, Yonghui [1, 2]
Alvarado, Michelle [4]
Salloum, Ramzi G. [1]
Braithwaite, Dejana [5, 6]
Huo, Jinhai [7]
Shih, Ya-Chen Tina [8]
Guo, Yi [1, 2]
Bian, Jiang [1, 2]
Affiliations
[1] Univ Florida, Coll Med, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL 32611 USA
[2] Univ Florida, Hlth Canc Ctr, Canc Informat Shared Resource, Gainesville, FL 32610 USA
[3] Univ Florida, Coll Med, Div Pulm Crit Care Sleep Med, Gainesville, FL USA
[4] Univ Florida, Dept Ind & Syst Engn, Gainesville, FL USA
[5] Univ Florida, Coll Publ Hlth & Hlth Profess, Dept Epidemiol, Gainesville, FL USA
[6] Univ Florida, Coll Med, Dept Aging & Geriatr Res, Gainesville, FL USA
[7] Bristol Myers Squibb, WW HEOR US Markets, Lawrenceville, NJ USA
[8] Univ Texas MD Anderson Canc Ctr Houston, Dept Hlth Serv Res, Sect Canc Econ & Policy, Houston, TX USA
Source
JCO CLINICAL CANCER INFORMATICS | 2025 / Vol. 9
Keywords
MORTALITY;
DOI
10.1200/CCI.24.00139
Chinese Library Classification (CLC) number
R73 [Oncology];
Discipline classification code
100214;
Abstract
PURPOSE: Lung cancer screening (LCS) has the potential to reduce mortality and detect lung cancer at its early stages, but the high false-positive rate associated with low-dose computed tomography (LDCT) for LCS acts as a barrier to its widespread adoption. This study aims to develop computable phenotype (CP) algorithms on the basis of electronic health records (EHRs) to identify individuals' eligibility for LCS, thereby enhancing LCS utilization in real-world settings.

MATERIALS AND METHODS: The study cohort included 5,778 individuals who underwent LDCT for LCS from 2012 to 2022, as recorded in the University of Florida Health Integrated Data Repository. CP rules derived from LCS guidelines were used to identify potential candidates, incorporating both structured EHR data and clinical notes analyzed via natural language processing. We then conducted manual reviews of 453 randomly selected charts to refine and validate these rules, assessing CP performance using metrics such as F1 score, specificity, and sensitivity.

RESULTS: We developed an optimal CP rule that integrates both structured and unstructured data, adhering to the US Preventive Services Task Force 2013 and 2020 guidelines. This rule focuses on age (55-80 years for 2013 and 50-80 years for 2020), smoking status (current, former, and others), and pack-years (>= 30 for 2013 and >= 20 for 2020), achieving F1 scores of 0.75 and 0.84 for the respective guidelines. Including unstructured data improved the F1 score by up to 9.2% for 2013 and 12.9% for 2020, compared with using structured data alone.

CONCLUSION: Our findings underscore the critical need for improved documentation of smoking information in EHRs, demonstrate the value of artificial intelligence techniques in enhancing CP performance, and confirm the effectiveness of EHR-based CP in identifying LCS-eligible individuals. This supports its potential to aid clinical decision making and optimize patient care.
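The rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Patient`, `is_eligible`, and `evaluate` are hypothetical, and only the age and pack-year thresholds come from the abstract. Treating "current" and "former" smokers as the qualifying statuses, and treating undocumented pack-years as not eligible, are assumptions. The USPSTF recommendations also limit former smokers to those who quit within the past 15 years; the abstract does not mention that criterion, so it is left out here. The `evaluate` helper shows how sensitivity, specificity, and F1 score would be computed against chart-review labels.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Age and pack-year thresholds as stated in the abstract.
GUIDELINES = {
    "2013": {"min_age": 55, "max_age": 80, "min_pack_years": 30.0},
    "2020": {"min_age": 50, "max_age": 80, "min_pack_years": 20.0},
}


@dataclass
class Patient:
    """Hypothetical record combining structured and note-derived fields."""
    age: int
    smoking_status: str            # "current", "former", or "other"
    pack_years: Optional[float]    # None when not documented


def is_eligible(patient: Patient, guideline: str) -> bool:
    """Apply the age / smoking status / pack-years rule for one guideline."""
    g = GUIDELINES[guideline]
    if patient.pack_years is None:
        # Assumption: undocumented pack-years cannot satisfy the rule.
        return False
    return (
        g["min_age"] <= patient.age <= g["max_age"]
        # Assumption: only current and former smokers qualify.
        and patient.smoking_status in {"current", "former"}
        and patient.pack_years >= g["min_pack_years"]
    )


def evaluate(predicted: Sequence[bool], actual: Sequence[bool]) -> dict:
    """Compare rule output with chart-review labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }
```

Under these assumptions, a 52-year-old former smoker with 25 pack-years meets the 2020 thresholds but not the 2013 ones, which reflects how the 2020 criteria widen the eligible population.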
Pages: 12