Explainable machine learning identifies a polygenic risk score as a key predictor of pancreatic cancer risk in the UK Biobank

Cited by: 3
Authors
Peduzzi, Giulia [1]
Felici, Alessio [1]
Pellungrini, Roberto
Campa, Daniele [1,2]
Affiliations
[1] Univ Pisa, Dept Biol, Via Luca Ghini 13, I-56126 Pisa, Italy
[2] Scuola Normale Super Pisa, Classe Sci, Piazza Cavalieri 7, I-56126 Pisa, Italy
Keywords
Pancreatic cancer; Risk prediction; Explainable artificial intelligence; Polygenic Risk Score; GENOME-WIDE ASSOCIATION; SUSCEPTIBILITY LOCI; BREAST-CANCER; VARIANTS; DISEASE; GENES; MODEL
DOI
10.1016/j.dld.2024.11.010
Chinese Library Classification (CLC) number
R57 [Diseases of the digestive system and abdomen];
Abstract
Background: Predicting the risk of developing pancreatic ductal adenocarcinoma (PDAC) is of paramount importance, given its high mortality rate. Current PDAC risk prediction models rely on a limited number of variables, do not include genetics, and have a modest accuracy.
Aim: This study aimed to develop an interpretable PDAC risk prediction model, based on machine learning (ML).
Methods: Five ML models (Adaptive Boosting, eXtreme Gradient Boosting, CatBoost, Deep Forest and Random Forest) built on 56 exposome variables and a polygenic risk score (PRS) were tested in 654 PDAC cases and 1,308 controls of the UK Biobank. Additionally, SHapley Additive exPlanation (SHAP) and Global model Interpretation via the Recursive Partitioning (Girp) were employed to explain the models.
Results: All models provided similar performance, but based on recall the best was CatBoost (77.10%). SHAP highlighted age and the PRS as primary contributors across all models. Girp developed rules to discern cases from controls, identifying age, PRS, and pancreatitis in most of the rules.
Conclusion: The predictive models tested have exhibited good performance, indicating their potential application in the clinical field in the near future, with the PRS playing a key role in identifying high-risk individuals as demonstrated by the explainers.
© 2024 Published by Elsevier Ltd on behalf of Editrice Gastroenterologica Italiana S.r.l.
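
The abstract relies on two quantities that are easy to state concretely: a polygenic risk score and recall. The sketch below is illustrative only. It assumes the common textbook definition of a PRS as a weighted sum of risk-allele dosages, which the abstract does not confirm as the study's exact construction, and it computes recall as the share of true cases flagged as cases. The function names and all toy inputs are hypothetical and are not data from the study.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant).

    Assumption: this is the generic weighted-sum definition, not necessarily
    the exact PRS construction used in the study.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true cases (label 1) that the model predicts as cases."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("recall is undefined without any true cases")
    return true_pos / (true_pos + false_neg)


# Hypothetical toy inputs, not values from the study.
toy_dosages = [0, 1, 2, 1]              # risk-allele counts at four variants
toy_weights = [0.10, 0.25, 0.05, 0.20]  # e.g. per-variant log odds ratios
toy_prs = polygenic_risk_score(toy_dosages, toy_weights)

toy_true = [1, 1, 1, 1, 0, 0, 0, 0]     # 1 = case, 0 = control
toy_pred = [1, 1, 1, 0, 0, 1, 0, 0]     # three of the four cases are caught
toy_recall = recall(toy_true, toy_pred)
```

Recall is a natural headline metric in a screening setting because a missed case (false negative) is usually costlier than a false alarm, which may be why the abstract ranks the models by it.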
Pages: 915-922
Number of pages: 8