Derivation and External Validation of Machine Learning-Based Model for Detection of Pancreatic Cancer

Cited by: 18
Authors
Chen, Wansu [1]
Zhou, Yichen [1]
Xie, Fagen [1]
Butler, Rebecca K. [1]
Jeon, Christie Y. [2]
Luong, Tiffany Q. [1]
Zhou, Botao [1]
Lin, Yu-Chen [2]
Lustigova, Eva [1]
Pisegna, Joseph R. [3, 4, 5]
Kim, Sungjin [2]
Wu, Bechien U. [6]
Affiliations
[1] Kaiser Permanente Southern Calif, Dept Res & Evaluat, Pasadena, CA 91103 USA
[2] Cedars Sinai Med Ctr, Los Angeles, CA USA
[3] VA Greater Los Angeles Healthcare Syst, Div Gastroenterol & Hepatol, Los Angeles, CA USA
[4] UCLA, Dept Med, David Geffen Sch Med, Los Angeles, CA USA
[5] UCLA, Dept Human Genet, David Geffen Sch Med, Los Angeles, CA USA
[6] Southern Calif Permanente Med Grp, Dept Gastroenterol, Ctr Pancreat Care, Los Angeles Med Ctr, Los Angeles, CA USA
Funding
National Institutes of Health (NIH), USA;
Keywords
RISK;
DOI
10.14309/ajg.0000000000002050
Chinese Library Classification (CLC) number
R57 [Diseases of the digestive system and abdomen];
Abstract
INTRODUCTION: There is currently no widely accepted approach to screening for pancreatic cancer (PC). We aimed to develop and validate a risk prediction model for pancreatic ductal adenocarcinoma (PDAC), the most common form of PC, across 2 health systems using electronic health records.
METHODS: This retrospective cohort study consisted of patients aged 50-84 years having at least 1 clinic-based visit over a 10-year study period at Kaiser Permanente Southern California (model training, internal validation) and the Veterans Affairs (VA, external testing). Random survival forests models were built to identify the most relevant predictors from >500 variables and to predict risk of PDAC within 18 months of cohort entry.
RESULTS: The Kaiser Permanente Southern California cohort consisted of 1.8 million patients (mean age 61.6) with 1,792 PDAC cases. The 18-month incidence rate of PDAC was 0.77 (95% confidence interval 0.73-0.80)/1,000 person-years. The final main model contained age, abdominal pain, weight change, HbA1c, and alanine transaminase change (c-index: mean = 0.77, SD = 0.02; calibration test: P value 0.4, SD 0.3). The final early detection model comprised the same features as those selected by the main model except for abdominal pain (c-index: 0.77 and SD 0.4; calibration test: P value 0.3 and SD 0.3). The VA testing cohort consisted of 2.7 million patients (mean age 66.1) with an 18-month incidence rate of 1.27 (1.23-1.30)/1,000 person-years. The recalibrated main and early detection models based on VA testing data sets achieved a mean c-index of 0.71 (SD 0.002) and 0.68 (SD 0.003), respectively.
DISCUSSION: Using widely available parameters in electronic health records, we developed and externally validated parsimonious machine learning-based models for detection of PC. These models may be suitable for real-time clinical application.
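Note on the c-index reported in the abstract: the concordance index measures how often, among comparable patient pairs, the patient with the earlier observed event also received the higher predicted risk (0.5 is chance, 1.0 is perfect ranking). The following is a minimal sketch of Harrell's c-index on made-up toy data, not the authors' code; the function name, the toy follow-up times, and the risk scores are all hypothetical, and pairs with tied follow-up times are simply skipped as a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times  : observed follow-up time per subject
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk score (higher means the event is expected sooner)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed event.
        # Simplification: pairs with tied follow-up times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: follow-up in months, event flag, predicted risk.
times = [2, 5, 7, 10, 12]
events = [1, 0, 1, 1, 0]
risks = [0.9, 0.3, 0.6, 0.7, 0.1]

c = harrell_c_index(times, events, risks)
print(f"c-index = {c:.3f}")
```

In this toy cohort 7 pairs are comparable and 6 of them are ranked correctly, so the sketch yields 6/7. Production analyses would normally rely on an established survival-analysis library rather than a hand-rolled loop.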
Pages: 157-167
Number of pages: 11
Related papers
21 in total
[1]   A Clinical Prediction Model to Assess Risk for Pancreatic Cancer Among Patients With New-Onset Diabetes [J].
Boursi, Ben ;
Finkelman, Brian ;
Giantonio, Bruce J. ;
Haynes, Kevin ;
Rustgi, Anil K. ;
Rhim, Andrew D. ;
Mamtani, Ronac ;
Yang, Yu-Xiao .
GASTROENTEROLOGY, 2017, 152 (04) :840-+
[2]  
Chen W., 2019, PERM J, V23, P18
[3]   Validation of the Enriching New-Onset Diabetes for Pancreatic Cancer Model in a Diverse and Integrated Healthcare Setting [J].
Chen, Wansu ;
Butler, Rebecca K. ;
Lustigova, Eva ;
Chari, Suresh T. ;
Wu, Bechien U. .
DIGESTIVE DISEASES AND SCIENCES, 2021, 66 (01) :78-87
[4]   Tests of calibration and goodness-of-fit in the survival setting [J].
Demler, Olga V. ;
Paynter, Nina P. ;
Cook, Nancy R. .
STATISTICS IN MEDICINE, 2015, 34 (10) :1659-1680
[5]   Random Survival Forest in practice: a method for modelling complex metabolomics data in time to event analysis [J].
Dietrich, Stefan ;
Floegel, Anna ;
Troll, Martina ;
Kuehn, Tilman ;
Rathmann, Wolfgang ;
Peters, Anette ;
Sookthai, Disorn ;
von Bergen, Martin ;
Kaaks, Rudolf ;
Adamski, Jerzy ;
Prehn, Cornelia ;
Boeing, Heiner ;
Schulze, Matthias B. ;
Illig, Thomas ;
Pischon, Tobias ;
Knueppel, Sven ;
Wang-Sattler, Rui ;
Drogan, Dagmar .
INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2016, 45 (05) :1406-1420
[6]   Insights From Advanced Analytics At The Veterans Health Administration [J].
Fihn, Stephan D. ;
Francis, Joseph ;
Clancy, Carolyn ;
Nielson, Christopher ;
Nelson, Karin ;
Rumsfeld, John ;
Cullen, Theresa ;
Bates, Jack ;
Graham, Gail L. .
HEALTH AFFAIRS, 2014, 33 (07) :1203-1211
[7]   Management of patients with increased risk for familial pancreatic cancer: updated recommendations from the International Cancer of the Pancreas Screening (CAPS) Consortium [J].
Goggins, Michael ;
Overbeek, Kasper Alexander ;
Brand, Randall ;
Syngal, Sapna ;
Del Chiaro, Marco ;
Bartsch, Detlef K. ;
Bassi, Claudio ;
Carrato, Alfredo ;
Farrell, James ;
Fishman, Elliot K. ;
Fockens, Paul ;
Gress, Thomas M. ;
Van Hooft, Jeanin E. ;
Hruban, R. H. ;
Kastrinos, Fay ;
Klein, Allison ;
Lennon, Anne Marie ;
Lucas, Aimee ;
Park, Walter ;
Rustgi, Anil ;
Simeone, Diane ;
Stoffel, Elena ;
Vasen, Hans F. A. ;
Cahen, Djuna L. ;
Canto, Marcia Irene ;
Bruno, Marco ;
Arcidiacono, Paolo Giorgio ;
Ashida, Reiko ;
Ausems, Margreet ;
Besselink, Marc ;
Biermann, Katharina ;
Bonsing, Bert ;
Brentnall, Teri ;
Chak, Amitabh ;
Early, Dayna ;
Fernandez-Del Castillo, Carloz ;
Frucht, Harold ;
Furukawa, Toru ;
Gallinger, Steven ;
Geurts, Jennifer ;
Koerkamp, Bas Groot ;
Hammel, Pascal ;
Hes, Frederik ;
Iglesias-Garcia, Julio ;
Kamel, Ihab ;
Kitano, Masayuki ;
Kloppel, Gunter ;
Krak, Nanda ;
Kurtz, Robert ;
Kwon, Richard .
GUT, 2020, 69 (01) :7-17
[8]  
Ishwaran H, randomForestSRC: Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)
[9]   RANDOM SURVIVAL FORESTS [J].
Ishwaran, Hemant ;
Kogalur, Udaya B. ;
Blackstone, Eugene H. ;
Lauer, Michael S. .
ANNALS OF APPLIED STATISTICS, 2008, 2 (03) :841-860
[10]   Genetic and Circulating Biomarker Data Improve Risk Prediction for Pancreatic Cancer in the General Population [J].
Kim, Jihye ;
Yuan, Chen ;
Babic, Ana ;
Bao, Ying ;
Clish, Clary B. ;
Pollak, Michael N. ;
Amundadottir, Laufey T. ;
Klein, Alison P. ;
Stolzenberg-Solomon, Rachael Z. ;
Pandharipande, Pari V. ;
Brais, Lauren K. ;
Welch, Marisa W. ;
Ng, Kimmie ;
Giovannucci, Edward L. ;
Sesso, Howard D. ;
Manson, Joann E. ;
Stampfer, Meir J. ;
Fuchs, Charles S. ;
Wolpin, Brian M. ;
Kraft, Peter .
CANCER EPIDEMIOLOGY BIOMARKERS & PREVENTION, 2020, 29 (05) :999-1008