Genes selection using deep learning and explainable artificial intelligence for chronic lymphocytic leukemia predicting the need and time to therapy

Cited by: 5
Authors
Morabito, Fortunato [1]
Adornetto, Carlo [2]
Monti, Paola [3]
Amaro, Adriana [4]
Reggiani, Francesco [4]
Colombo, Monica [5]
Rodriguez-Aldana, Yissel [2]
Tripepi, Giovanni [6]
D'Arrigo, Graziella [6]
Vener, Claudia [7]
Torricelli, Federica [8]
Rossi, Teresa [8]
Neri, Antonino [9]
Ferrarini, Manlio [10]
Cutrona, Giovanna [5]
Gentile, Massimo [11, 12]
Greco, Gianluigi [2]
Affiliations
[1] A Sforza Foundat, Biotechnol Res Unit, Cosenza, Italy
[2] Univ Calabria, Dept Math & Comp Sci, Cosenza, Italy
[3] Osped Policlin San Martino, Mutagenesis & Canc Prevent Unit, Ist Ricovero & Cura Carattere Sci IRCCS, Genoa, Italy
[4] Osped Policlin San Martino, Tumor Epigenet Unit, Ist Ricovero & Cura Carattere Sci IRCCS, Genoa, Italy
[5] Osped Policlin San Martino, Mol Pathol Unit, Ist Ricovero & Cura Carattere Sci IRCCS, Genoa, Italy
[6] CNR, Consiglio Nazl Ric CNR, Ist Fisiol Clin, Reggio Di Calabria, Italy
[7] Univ Milan, Dept Oncol & Hematooncol, Milan, Italy
[8] Ist Ricovero & Cura Carattere Sci USL IRCCS Reggio, Azienda Unita Sanit Locale, Lab Translat Res, Reggio Emilia, Italy
[9] Ist Ricovero & Cura Carattere Sci USL IRCCS Reggio, Azienda Unita Sanit Locale, Sci Directorate, Reggio Emilia, Italy
[10] Osped Policlin San Martino, Unita Operat UO Mol Pathol, Ist Ricovero & Cura Carattere Sci IRCCS, Genoa, Italy
[11] Azienda Osped AO Cosenza, Dept Oncohematol, Hematol Unit, Cosenza, Italy
[12] Univ Calabria, Dept Pharm & Hlth & Nutr Sci, Cosenza, Italy
Source
FRONTIERS IN ONCOLOGY | 2023 / Vol. 13
Keywords
chronic lymphocytic leukemia; gene expression profile; deep learning; explainable artificial intelligence; feature selection; GROWTH-FACTOR-I; B-CELLS; EXPRESSION SIGNATURE; PROGNOSTIC INDEX; CLL-IPI; SURVIVAL; RECEPTOR; PATHWAY; RNA; PROGRESSION;
DOI
10.3389/fonc.2023.1198992
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Analyzing gene expression profiles (GEP) through artificial intelligence provides meaningful insight into cancer. This study introduces DeepSHAP Autoencoder Filter for Genes Selection (DSAF-GS), a novel deep learning and explainable artificial intelligence-based approach for feature selection in genomic-scale data. DSAF-GS exploits the autoencoder's reconstruction capabilities without changing the original feature space, which makes the results easier to interpret. Explainable artificial intelligence is then used to select the genes most informative for chronic lymphocytic leukemia (CLL) prognosis in 217 cases from a GEP database comprising roughly 20,000 genes. The model for prognosis prediction achieved an accuracy of 86.4%, a sensitivity of 85.0%, and a specificity of 87.5%. According to the proposed approach, predictions were strongly influenced by CEACAM19 and PIGP, moderately influenced by MKL1 and GNE, and weakly influenced by other genes. The 10 most influential genes were selected for further analysis. Among them, FADD, FIBP, GNE, IGF1R, MKL1, PIGP, and SLC39A6 were identified in the Reactome pathway database as involved in signal transduction, transcription, protein metabolism, immune system, cell cycle, and apoptosis. Moreover, according to the network model of the 3D protein-protein interaction (PPI) explored using the NetworkAnalyst tool, FADD, FIBP, IGF1R, QTRT1, GNE, SLC39A6, and MKL1 appear coupled into a complex network. Finally, all 10 selected genes showed predictive power for time to first treatment (TTFT) in univariate analyses on a basic prognostic model including IGHV mutational status, del(11q) and del(17p), NOTCH1 mutations, β2-microglobulin, Rai stage, and B-lymphocytosis, all known to predict TTFT in CLL.
However, only IGF1R [hazard ratio (HR) 1.41, 95% CI 1.08-1.84, P=0.013], COL28A1 (HR 0.32, 95% CI 0.10-0.97, P=0.045), and QTRT1 (HR 7.73, 95% CI 2.48-24.04, P<0.001) were significantly associated with TTFT in multivariable analyses when combined with the prognostic factors of the basic model, increasing Harrell's c-index to 78.6% (versus 76.5% for the basic prognostic model) and the explained variation to 52.6% (versus 42.2% for the basic prognostic model). The goodness of model fit was also enhanced (χ² = 20.1, P=0.002), indicating improved performance over the basic prognostic model. In conclusion, DSAF-GS identified a group of significant genes for CLL prognosis, suggesting future directions for bio-molecular research.
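The abstract compares models by Harrell's c-index, which is the fraction of comparable patient pairs that a risk score ranks in the right order. The following is a minimal Python sketch of that statistic, not the authors' code. The function name, the argument names, and the toy data are assumptions made for illustration. As a simplification, pairs with identical follow-up times are skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored data (simplified).

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. first treatment) was observed, 0 if censored
    risks  : model risk score; a higher score means an earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the two times differ and the
        # shorter time ends in an observed event rather than censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk had the earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

In the toy data, five pairs are comparable and four are ranked correctly, so `toy_c` is 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.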
Pages: 17