SMRT: Randomized Data Transformation for Cancer Subtyping and Big Data Analysis

Cited by: 4
Authors
Nguyen, Hung [1]
Tran, Duc [1]
Tran, Bang [1]
Roy, Monikrishna [1]
Cassell, Adam [1]
Dascalu, Sergiu [1]
Draghici, Sorin [2]
Nguyen, Tin [1]
Affiliations
[1] Univ Nevada, Dept Comp Sci & Engn, Reno, NV 89557 USA
[2] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
Source
FRONTIERS IN ONCOLOGY | 2021 / Vol. 11
Funding
U.S. National Science Foundation;
Keywords
cancer subtyping; multi-omics integration; web application; CRAN package; survival analysis; DISCOVERY; MODULES; GENE; SURVIVAL; TUMORS; JOINT;
DOI
10.3389/fonc.2021.725133
Chinese Library Classification number
R73 [Oncology];
Discipline classification code
100214;
Abstract
Cancer is an umbrella term that includes a range of disorders, from those that are fast-growing and lethal to indolent lesions with low or delayed potential for progression to death. The treatment options, as well as treatment success, are highly dependent on the correct subtyping of individual patients. With the advancement of high-throughput platforms, we have the opportunity to differentiate among cancer subtypes from a holistic perspective that takes into consideration phenomena at different molecular levels (mRNA, methylation, etc.). This demands powerful integrative methods to leverage large multi-omics datasets for a better subtyping. Here we introduce Subtyping Multi-omics using a Randomized Transformation (SMRT), a new method for multi-omics integration and cancer subtyping. SMRT offers the following advantages over existing approaches: (i) the scalable analysis pipeline allows researchers to integrate multi-omics data and analyze hundreds of thousands of samples in minutes, (ii) the ability to integrate data types with different numbers of patients, (iii) the ability to analyze un-matched data of different types, and (iv) the ability to offer users a convenient data analysis pipeline through a web application. We also improve the efficiency of our ensemble-based, perturbation clustering to support analysis on machines with memory constraints. In an extensive analysis, we compare SMRT with eight state-of-the-art subtyping methods using 37 TCGA and two METABRIC datasets comprising a total of almost 12,000 patient samples from 28 different types of cancer. We also performed a number of simulation studies. We demonstrate that SMRT outperforms other methods in identifying subtypes with significantly different survival profiles. In addition, SMRT is extremely fast, being able to analyze hundreds of thousands of samples in minutes. The web application is available at http://SMRT.tinnguyen-lab.com. 
The R package will be deposited to CRAN as part of our PINSPlus software suite.
Pages: 11
Related papers
50 in total
  • [1] Evaluation and comparison of multi-omics data integration methods for cancer subtyping
    Duan, Ran
    Gao, Lin
    Gao, Yong
    Hu, Yuxuan
    Xu, Han
    Huang, Mingfeng
    Song, Kuo
    Wang, Hongda
    Dong, Yongqiang
    Jiang, Chaoqun
    Zhang, Chenxing
    Jia, Songwei
    PLOS COMPUTATIONAL BIOLOGY, 2021, 17 (08)
  • [2] Multi-Omics Data Fusion for Cancer Molecular Subtyping Using Sparse Canonical Correlation Analysis
    Qi, Lin
    Wang, Wei
    Wu, Tan
    Zhu, Lina
    He, Lingli
    Wang, Xin
    FRONTIERS IN GENETICS, 2021, 12
  • [3] UMAP guided topological analysis of transcriptomic data for cancer subtyping
    Rather, A. A.
    Chachoo, M. A.
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY, 2022, 14 (06) : 2855 - 2865
  • [4] Data-transformation approach to lifetimes data analysis: An overview
    Mudholkar, Govind S.
    Asubonteng, Kobby O.
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2010, 140 (10) : 2904 - 2917
  • [5] MCNF: A Novel Method for Cancer Subtyping by Integrating Multi-Omics and Clinical Data
    Zhao, Lan
    Yan, Hong
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (05) : 1682 - 1690
  • [6] Subtyping children with asthma by clustering analysis of mRNA expression data
    Wang, Ting
    He, Changhui
    Hu, Ming
    Wu, Honghua
    Ou, Shuteng
    Li, Yuke
    Fan, Chuping
    FRONTIERS IN GENETICS, 2022, 13
  • [7] Big Data-Led Cancer Research, Application, and Insights
    Brown, James A. L.
    Chonghaile, Triona Ni
    Matchett, Kyle B.
    Lynam-Lennon, Niamh
    Kiely, Patrick A.
    CANCER RESEARCH, 2016, 76 (21) : 6167 - 6170
  • [8] Robust correlation estimation and UMAP assisted topological analysis of omics data for disease subtyping
    Rather, Arif Ahmad
    Chachoo, Manzoor Ahmad
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 155
  • [9] NEMO: cancer subtyping by integration of partial multi-omic data
    Rappoport, Nimrod
    Shamir, Ron
    BIOINFORMATICS, 2019, 35 (18) : 3348 - 3356
  • [10] Cancer subtyping with heterogeneous multi-omics data via hierarchical multi-kernel learning
    Wei, Yifang
    Li, Lingmei
    Zhao, Xin
    Yang, Haitao
    Sa, Jian
    Cao, Hongyan
    Cui, Yuehua
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (01)