Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data

Cited by: 77
Authors
Chen, Runpu [1]
Yang, Le [1]
Goodison, Steve [2]
Sun, Yijun [1, 3, 4]
Affiliations
[1] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14214 USA
[2] Mayo Clin, Dept Hlth Sci Res, Jacksonville, FL 32224 USA
[3] SUNY Buffalo, Dept Microbiol & Immunol, Buffalo, NY 14214 USA
[4] SUNY Buffalo, Dept Biostat, Buffalo, NY 14214 USA
Keywords
BREAST-CANCER; MODEL; DISCOVERY; CLUSTERS
DOI
10.1093/bioinformatics/btz769
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes.

Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data.
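
The abstract describes one objective that combines supervised classification, unsupervised clustering and dimensionality reduction. The sketch below shows one way such a joint loss could be assembled. It is an illustration only, not the DeepType implementation. The function name `joint_loss`, the weights `alpha` and `beta`, the single tanh hidden layer, the K-means-style distance term and the row-wise sparsity penalty are all assumptions introduced here; the abstract does not specify any of them.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def joint_loss(X, y, W1, W2, centers, alpha=0.1, beta=0.01):
    """Hypothetical joint objective (all terms are assumptions, not the paper's).

    X: (n, d) expression matrix, y: (n,) integer class labels,
    W1: (d, h) projection to a low-dimensional representation,
    W2: (h, c) classifier weights, centers: (k, h) cluster centers.
    """
    n = X.shape[0]
    H = np.tanh(X @ W1)                       # dimensionality reduction
    P = softmax(H @ W2)                       # class probabilities
    ce = -np.log(P[np.arange(n), y] + 1e-12).mean()   # supervised term
    # Unsupervised term: squared distance to the nearest cluster center.
    d2 = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    cluster = d2.min(axis=1).mean()
    # Row-wise norm of W1: pushes whole input features (genes) toward zero.
    sparsity = np.linalg.norm(W1, axis=1).sum()
    total = ce + alpha * cluster + beta * sparsity
    return total, (ce, cluster, sparsity)

# Synthetic data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
y = rng.integers(0, 3, size=20)
W1 = rng.normal(scale=0.1, size=(50, 4))
W2 = rng.normal(scale=0.1, size=(4, 3))
centers = rng.normal(size=(2, 4))
total, (ce, cluster, sparsity) = joint_loss(X, y, W1, W2, centers)
print(total, ce, cluster, sparsity)
```

In a sketch like this, raising `beta` would tend to drive more rows of `W1` to zero (fewer input features in use), while raising `alpha` would favor tighter clusters in the learned representation at some cost to the classification term.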
Pages: 1476-1483
Page count: 8