Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction

Cited by: 6
Authors
Liu, Jian [1, 2]
Ge, Shuguang [1, 2]
Cheng, Yuhu [1, 2]
Wang, Xuesong [1, 2]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
[2] China Univ Min & Technol, Engn Res Ctr Intelligent Control Underground Spac, Minist Educ, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
multi-view clustering; cancer subtypes prediction; multi-omics data; spectral clustering; smooth representation; graph fusion; HETEROGENEITY; DISCOVERY;
DOI
10.3389/fgene.2021.718915
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Designing an integrated machine learning model that discovers cancer subtypes and characterizes cancer heterogeneity from multiple omics data is a vital task. In recent years, several multi-view clustering algorithms have been proposed and applied to cancer subtype prediction; among them, methods based on graph learning have attracted wide attention. These approaches usually suffer from one or more of the following problems. First, many of them construct the similarity matrix directly from the original omics data matrix and do not learn it. Second, they separate data clustering from graph learning, so clustering performance depends heavily on the predefined graph. Third, during graph fusion they simply average the affinity graphs of the views, so the rich heterogeneous information is not fully exploited. To solve these problems, this paper proposes a Multi-view Spectral Clustering Based on Multi-smooth Representation Fusion (MRF-MSC) method. First, MRF-MSC constructs a smooth representation for each data type, which can be viewed as a sample (patient) similarity matrix and which explicitly enhances the grouping effect. Second, MRF-MSC integrates the smooth representations of the omics data types through graph fusion into a single similarity matrix that contains information from all of them. In addition, MRF-MSC adaptively assigns a weight to the smooth representation of each omics data type through a self-weighting method. Finally, MRF-MSC imposes a Laplacian rank constraint on the fused similarity matrix to obtain a better cluster structure. The resulting problem can be transformed into a spectral clustering problem, whose solution gives the clustering results. MRF-MSC unifies graph construction, graph fusion, and spectral clustering in one framework, which lets it learn better data representations and higher-quality graphs and thereby achieve better clustering. In experiments, MRF-MSC obtained good results on the TCGA cancer data sets.
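The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it simplifies the method in several ways that are assumptions of this example: the smoothness regularizer is taken to be the Laplacian of a k-nearest-neighbour graph, the three stages run one after another instead of being optimized jointly, and the Laplacian rank constraint is replaced by ordinary normalized spectral clustering on the fused graph. All function names and the parameters `alpha`, `k`, and `ridge` are hypothetical, and the data are synthetic.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetric kNN graph over the rows (samples) of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                      # a sample is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                            # symmetrize the graph
    return np.diag(W.sum(axis=1)) - W


def smooth_representation(X, alpha=10.0, k=5, ridge=1e-6):
    """Affinity from min_Z ||D - D Z||_F^2 + alpha * tr(Z L Z^T), with D = X.T.

    The stationarity condition is the Sylvester equation G Z + alpha Z L = G,
    G = X X^T. G and L are symmetric, so it is solved by two eigendecompositions.
    A small ridge (assumption of this sketch) keeps the system well posed.
    """
    G = X @ X.T
    lam, U = np.linalg.eigh(G + ridge * np.eye(len(G)))
    mu, V = np.linalg.eigh(alpha * knn_laplacian(X, k))
    denom = np.maximum(lam, ridge)[:, None] + np.maximum(mu, 0.0)[None, :]
    Z = U @ ((U.T @ G @ V) / denom) @ V.T
    return (np.abs(Z) + np.abs(Z.T)) / 2.0            # symmetric, non-negative affinity


def self_weighted_fusion(graphs, n_iter=20, eps=1e-12):
    """Alternate the fused graph S with view weights w_v = 1 / (2 ||S - S_v||_F)."""
    graphs = np.stack(graphs)
    w = np.full(len(graphs), 1.0 / len(graphs))       # start from a plain average
    for _ in range(n_iter):
        S = np.tensordot(w, graphs, axes=1)
        w = 1.0 / (2.0 * np.linalg.norm(graphs - S, axis=(1, 2)) + eps)
        w /= w.sum()
    return np.tensordot(w, graphs, axes=1), w


def spectral_clustering(S, n_clusters, n_iter=50):
    """Normalized spectral embedding of S followed by a small deterministic k-means."""
    inv_sqrt = 1.0 / np.sqrt(np.maximum(S.sum(axis=1), 1e-12))
    L = np.eye(len(S)) - inv_sqrt[:, None] * S * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                       # eigenvalues in ascending order
    F = vecs[:, :n_clusters]
    F = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    centers = [F[0]]                                  # farthest-point initialization
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((F[:, None, :] - centers[None]) ** 2).sum(axis=2), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = F[labels == c].mean(axis=0)
    return labels


# Synthetic stand-in for three omics views of the same 60 patients (3 subtypes).
rng = np.random.default_rng(0)
truth = np.repeat(np.arange(3), 20)
views = []
for dim, noise in [(10, 0.1), (15, 0.2), (8, 0.4)]:
    centers = 4.0 * rng.standard_normal((3, dim))
    views.append(centers[truth] + noise * rng.standard_normal((60, dim)))

S, weights = self_weighted_fusion([smooth_representation(X) for X in views])
labels = spectral_clustering(S, n_clusters=3)
print("view weights:", np.round(weights, 3))
print("labels:", labels)
```

In this sketch a view whose graph lies farther from the fused graph receives a smaller weight, which is the role the abstract assigns to self-weighting. A suitable `alpha` depends on the scale of the data, since it balances the Gram matrix against the graph Laplacian.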
Pages: 13