Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data

Cited by: 39
Authors
Franco, Edian F. [1, 2, 3]
Rana, Pratip [4]
Cruz, Aline [5]
Calderon, Victor V. [3]
Azevedo, Vasco [6]
Ramos, Rommel T. J. [3]
Ghosh, Preetam [4]
Affiliations
[1] Fed Univ Para, Inst Biol Sci, BR-66075110 Belem, PA, Brazil
[2] Inst Innovac Biotecnol & Ind IIBI, Lab Virol & Environm Genom, Santo Domingo 10104, Dominican Rep
[3] Inst Tecnol Santo Domingo INTEC, Santo Domingo 10602, Dominican Rep
[4] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
[5] Fed Univ Para, Programa Posgrad Enfermagem, BR-66075110 Belem, PA, Brazil
[6] Univ Fed Minas Gerais, Inst Biol Sci, BR-31270901 Belo Horizonte, MG, Brazil
Keywords
cancer subtype detection; multi-omics data; data integration; autoencoder; survival analysis; IDENTIFICATION; SELECTION; PACKAGE
DOI
10.3390/cancers13092013
Chinese Library Classification number
R73 [Oncology]
Subject classification code
100214
Abstract
Simple Summary: Here, we compared the performance of four different autoencoders, (a) vanilla, (b) sparse, (c) denoising, and (d) variational, for subtype detection on four cancer types: glioblastoma multiforme, colon adenocarcinoma, kidney renal clear cell carcinoma, and breast invasive carcinoma. A multiview dataset comprising gene expression, DNA methylation, and miRNA expression from TCGA is fed into an autoencoder to obtain a compressed nonlinear representation. A clustering technique is then applied to that compressed representation to reveal the cancer subtypes. Although the performance of the different autoencoders varies across datasets, they performed much better than standard data fusion techniques such as PCA, kernel PCA, and sparse PCA.
A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending on the activated pathway(s), patient survival varies significantly, and patients respond differently to various drugs. Therefore, cancer subtype detection using genomics-level data is a significant research problem. Subtype detection is often complex and, in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes for each cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. This shows that the results from the autoencoders, obtained through the interaction of different data types of cancer, can be used for the prediction and characterization of patient subgroups and survival profiles.
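The pipeline summarized in the abstract (concatenate omics views, compress them with an autoencoder, cluster the compressed representation, and choose the number of subtypes by silhouette score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than TCGA, only a vanilla autoencoder with one tanh hidden layer is shown, k-means stands in for the unspecified clustering technique, and all function names and hyperparameters are hypothetical rather than taken from the paper.

```python
import numpy as np

def standardize(X):
    # Scale each feature to zero mean and unit variance.
    return (X - X.mean(axis=0)) / X.std(axis=0)

def train_vanilla_autoencoder(X, latent_dim=2, epochs=500, lr=0.01, seed=0):
    # Hypothetical minimal autoencoder: tanh encoder, linear decoder,
    # trained by full-batch gradient descent on mean squared reconstruction error.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, latent_dim)); b1 = np.zeros(latent_dim)
    W2 = rng.normal(0.0, 0.1, (latent_dim, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # compressed representation
        R = H @ W2 + b2                     # reconstruction
        G = 2.0 * (R - X) / n               # gradient of the loss w.r.t. R
        GH = (G @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
        W2 -= lr * (H.T @ G); b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ GH); b1 -= lr * GH.sum(axis=0)
    return np.tanh(X @ W1 + b1)

def kmeans(Z, k, iters=50, seed=0):
    # Plain Lloyd's algorithm; an assumed stand-in for the clustering step.
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), k, replace=False)]
    for _ in range(iters):
        labels = ((Z[:, None, :] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return ((Z[:, None, :] - centers[None]) ** 2).sum(-1).argmin(axis=1)

def silhouette(Z, labels):
    # Mean silhouette coefficient over all samples (Euclidean distance).
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0
    D = np.sqrt(((Z[:, None, :] - Z[None]) ** 2).sum(-1))
    scores = []
    for i in range(len(Z)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)              # singleton cluster convention
            continue
        a = D[i, same].sum() / (same.sum() - 1)
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return float(np.mean(scores))

# Synthetic stand-in for three omics views of 90 patients in 3 hidden groups.
rng = np.random.default_rng(42)
group = np.repeat(np.arange(3), 30)
views = []
for _ in range(3):
    centers = rng.normal(0.0, 3.0, (3, 10))
    views.append(centers[group] + rng.normal(0.0, 1.0, (90, 10)))
X = standardize(np.hstack(views))           # early fusion by concatenation

Z = train_vanilla_autoencoder(X)
scores = {k: silhouette(Z, kmeans(Z, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In a real analysis the synthetic matrix would be replaced by preprocessed TCGA views, and the resulting cluster labels would then be compared on survival, as the abstract describes.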
Pages: 17