Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data

Cited by: 38
Authors
Franco, Edian F. [1, 2, 3]
Rana, Pratip [4]
Cruz, Aline [5]
Calderon, Victor V. [3]
Azevedo, Vasco [6]
Ramos, Rommel T. J. [3]
Ghosh, Preetam [4]
Affiliations
[1] Fed Univ Para, Inst Biol Sci, BR-66075110 Belem, PA, Brazil
[2] Inst Innovac Biotecnol & Ind IIBI, Lab Virol & Environm Genom, Santo Domingo 10104, Dominican Rep
[3] Inst Tecnol Santo Domingo INTEC, Santo Domingo 10602, Dominican Rep
[4] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
[5] Fed Univ Para, Programa Posgrad Enfermagem, BR-66075110 Belem, PA, Brazil
[6] Univ Fed Minas Gerais, Inst Biol Sci, BR-31270901 Belo Horizonte, MG, Brazil
Keywords
cancer subtype detection; multi-omics data; data integration; autoencoder; survival analysis; IDENTIFICATION; SELECTION; PACKAGE;
DOI
10.3390/cancers13092013
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Simple Summary: Here, we compared the performance of four autoencoders, (a) vanilla, (b) sparse, (c) denoising, and (d) variational, for subtype detection on four cancer types: glioblastoma multiforme, colon adenocarcinoma, kidney renal clear cell carcinoma, and breast invasive carcinoma. A multiview dataset comprising gene expression, DNA methylation, and miRNA expression from TCGA was fed into an autoencoder to obtain a compressed nonlinear representation. A clustering technique was then applied to that compressed representation to reveal the cancer subtypes. Although the autoencoders' performance varied across datasets, they performed much better than standard data fusion techniques such as PCA, kernel PCA, and sparse PCA.

A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending on the activated pathway(s), patient survival varies significantly, as does the efficacy of various drugs. Cancer subtype detection using genomics-level data is therefore a significant research problem. Subtype detection is often complex and, in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in each cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection.
For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each subtype. The results are consistent with other genomic studies and can be corroborated with the pathways and biological functions involved. This shows that the results from the autoencoders, obtained through the interaction of different cancer data types, can be used for the prediction and characterization of patient subgroups and survival profiles.
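
The abstract mentions choosing the number of subtypes with the silhouette score after clustering the compressed representation. The following is a minimal sketch of that selection step only, not the authors' pipeline: the autoencoder is omitted, the toy 2-D points stand in for a learned latent representation, and k-means is an assumed clustering method (the abstract does not name one). All function and variable names are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    clusters = set(labels)
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, l in zip(points, labels) if l == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy stand-in for a latent representation: three well-separated groups.
latent = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]

# Score each candidate number of clusters and keep the best one.
scores = {k: silhouette(latent, kmeans(latent, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

On real data the input would be the compressed representation of each patient, and the chosen number of clusters would then be checked against survival differences, as the abstract describes.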
Pages: 17
Related papers
50 in total
  • [21] Integrating multi-omics data through deep learning for accurate cancer prognosis prediction
    Chai, Hua
    Zhou, Xiang
    Zhang, Zhongyue
    Rao, Jiahua
    Zhao, Huiying
    Yang, Yuedong
    COMPUTERS IN BIOLOGY AND MEDICINE, 2021, 134
  • [22] MOFNet: A Deep Learning Framework of Integrating Multi-omics Data for Breast Cancer Diagnosis
    Zhang, Chunxiao
    Li, Pengpai
    Sun, Duanchen
    Liu, Zhi-Ping
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088 : 727 - 738
  • [23] MoVAE: Multi-Omics Variational Auto-Encoder for Cancer Subtype Detection
    Rahmanian, Mohsen
    Mansoori, Eghbal G.
    IEEE ACCESS, 2024, 12 : 133617 - 133631
  • [24] Unsupervised classification of multi-omics data during cardiac remodeling using deep learning
    Chung, Neo Christopher
    Mirza, Bilal
    Choi, Howard
    Wang, Jie
    Wang, Ding
    Ping, Peipei
    Wang, Wei
    METHODS, 2019, 166 : 66 - 73
  • [25] MDICC: novel method for multi-omics data integration and cancer subtype identification
    Yang, Ying
    Tian, Sha
    Qiu, Yushan
    Zhao, Pu
    Zou, Quan
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (03)
  • [26] Cancer Molecular Subtype Classification by Graph Convolutional Networks on Multi-omics Data
    Li, Bingjun
    Wang, Tianyu
    Nabavi, Sheida
    12TH ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS (ACM-BCB 2021), 2021
  • [27] A hierarchical integration deep flexible neural forest framework for cancer subtype classification by integrating multi-omics data
    Jing Xu
    Peng Wu
    Yuehui Chen
    Qingfang Meng
    Hussain Dawood
    Hassan Dawood
    BMC Bioinformatics, 20
  • [28] A hierarchical integration deep flexible neural forest framework for cancer subtype classification by integrating multi-omics data
    Xu, Jing
    Wu, Peng
    Chen, Yuehui
    Meng, Qingfang
    Dawood, Hussain
    Dawood, Hassan
    BMC BIOINFORMATICS, 2019, 20 (01)
  • [29] Deep Learning for Integrated Analysis of Insulin Resistance with Multi-Omics Data
    Huang, Eunchong
    Kim, Sarah
    Ahn, TaeJin
    JOURNAL OF PERSONALIZED MEDICINE, 2021, 11 (02) : 1 - 14
  • [30] Classifying the multi-omics data of gastric cancer using a deep feature selection method
    Hu, Yanyu
    Zhao, Long
    Li, Zhao
    Dong, Xiangjun
    Xu, Tiantian
    Zhao, Yuhai
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 200