Prediction of Gastrointestinal Tract Cancers Using Longitudinal Electronic Health Record Data

Cited by: 5
Authors
Read, Andrew J. [1, 2, 3]
Zhou, Wenjing [4]
Saini, Sameer D. [1, 2, 3, 5]
Zhu, Ji [3, 4]
Waljee, Akbar K. [1, 2, 3, 5]
Affiliations
[1] Univ Michigan, Dept Internal Med, Div Gastroenterol & Hepatol, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Inst Healthcare Policy & Innovat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Michigan Integrated Ctr Hlth Analyt & Med Predict, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
[5] VA HSR&D Ctr Clin Management Res, Ann Arbor, MI 48105 USA
Keywords
gastrointestinal cancers; prediction model; machine learning; colorectal cancer; risk factors; gastric cancer; diagnosis; prognosis; model
DOI
10.3390/cancers15051399
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Simple Summary: Cancers of the gastrointestinal tract, including the esophagus, stomach, and intestines, are often diagnosed at an advanced stage, when curative treatment is rarely possible. These cancers can all cause gastrointestinal bleeding, but the bleeding often occurs gradually and may go unnoticed by patients. Changes in routine laboratory parameters such as the complete blood count may reveal this subtle blood loss before clinical presentation or the development of iron deficiency anemia. The aim of our study was to develop models for the prediction of luminal gastrointestinal tract cancers (esophageal, gastric, small bowel, colorectal, anal) using data routinely available within an electronic health record, in a retrospective cohort from an academic medical center. The cohort included 148,158 individuals, with 1025 gastrointestinal tract cancers. We found that longitudinal prediction models using the complete blood count outperformed a single timepoint logistic model for 3-year cancer prediction.
Background: Luminal gastrointestinal (GI) tract cancers, including esophageal, gastric, small bowel, colorectal, and anal cancers, are often diagnosed at late stages. These tumors can cause gradual GI bleeding, which may go unrecognized but can be detected through subtle laboratory changes. Our aim was to develop models to predict luminal GI tract cancers from laboratory studies and patient characteristics, using logistic regression and random forest machine learning methods.
Methods: This was a single-center, retrospective cohort study at an academic medical center of individuals who had at least two complete blood counts (CBCs), with enrollment between 2004 and 2013 and follow-up until 2018. The primary outcome was the diagnosis of GI tract cancer. Prediction models were developed using multivariable single timepoint logistic regression, longitudinal logistic regression, and random forest machine learning.
Results: The cohort included 148,158 individuals, with 1025 GI tract cancers. For 3-year prediction of GI tract cancers, the longitudinal random forest model performed best, with an area under the receiver operating characteristic curve (AuROC) of 0.750 (95% CI 0.729-0.771) and a Brier score of 0.116, compared with the longitudinal logistic regression model, which had an AuROC of 0.735 (95% CI 0.713-0.757) and a Brier score of 0.205.
Conclusions: Prediction models incorporating longitudinal features of the CBC outperformed the single timepoint logistic regression models at 3 years, with a trend toward improved prediction accuracy for the random forest machine learning model compared with the longitudinal logistic regression model.
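The two metrics reported in the abstract can be illustrated with a minimal sketch. AuROC measures discrimination (higher is better; 0.5 is chance), while the Brier score is the mean squared error of the predicted probabilities (lower is better). The function names and the toy data below are assumptions made purely for illustration. They are not taken from the study, and this is not the authors' evaluation code.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random case is scored above a random non-case.

    Computed by comparing every case/non-case pair; ties count as half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted risks, invented for illustration only.
y_true = [0, 0, 1, 0, 1, 0]
y_prob = [0.1, 0.4, 0.35, 0.2, 0.8, 0.05]

print(f"AuROC: {auroc(y_true, y_prob):.3f}")
print(f"Brier: {brier_score(y_true, y_prob):.4f}")
```

The pairwise comparison is quadratic in cohort size, so for a cohort as large as the one described here, a rank-based implementation from an established library would be the more practical choice.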
Pages: 14