Learning to Train and to Explain a Deep Survival Model with Large-Scale Ovarian Cancer Transcriptomic Data

Authors
Menand, Elena Spirina [1 ,2 ]
De Vries-Brilland, Manon [2 ,3 ]
Tessier, Leslie [2 ]
Dauve, Jonathan [2 ]
Campone, Mario [4 ,5 ]
Verriele, Veronique [6 ]
Jrad, Nisrine [1 ]
Marion, Jean-Marie [1 ]
Chauvet, Pierre [1 ]
Passot, Christophe [2 ]
Morel, Alain [2 ,5 ]
Affiliations
[1] Univ Angers, Lab Angevin Rech Ingn Syst EA7315, F-49035 Angers, France
[2] Inst Cancerol Ouest Nantes Angers, Unite Genom Fonct, F-49055 Angers, France
[3] Inst Cancerol Ouest Nantes Angers, F-49000 Angers, France
[4] Inst Cancerol Ouest Nantes Angers, F-49000 Angers, France
[5] Nantes Univ, Univ Angers, CNRS, Inserm,CRCI2NA,SFR ICAT, F-49000 Angers, France
[6] Inst Cancerol Ouest Nantes Angers, Dept Anat & Cytol Pathol, F-49055 Angers, France
Keywords
TCGA; ovarian cancer; RNA-seq; survival analysis; deep learning; molecular pathways; signatures
DOI
10.3390/biomedicines12122881
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background/Objectives: Ovarian cancer is a complex disease with poor outcomes that affects women worldwide. The lack of successful therapeutic options for this malignancy has created a need for novel biomarkers for patient stratification. Here, we aim to develop outcome predictors based on gene expression data, as they may help identify categories of patients who are more likely to respond to certain therapies. Methods: We used The Cancer Genome Atlas (TCGA) ovarian cancer transcriptomic data from 372 patients and approximately 16,600 genes to train and evaluate deep learning survival models. In addition, we collected an in-house validation dataset of 12 patients to assess the performance of the trained survival models for direct use in clinical practice. Despite disappointing generalization capabilities, we demonstrated how our model can be interpreted to uncover biological processes associated with survival. We calculated the contributions of the input genes to the output of the best trained model and derived the corresponding molecular pathways. Results: These pathways allowed us to stratify the TCGA patients into high-risk and low-risk groups (p-value 0.025). We validated the stratification ability of the identified pathways on the in-house dataset of 12 patients (p-value 0.229) and on an external clinical and molecular dataset of 274 patients (p-value 0.006). Conclusions: Deep learning-based models for survival prediction with RNA-seq data could be used to detect and interpret the gene sets associated with survival in ovarian cancer patients, opening a new avenue for future research.
Pages: 16