A deep contrastive multi-modal encoder for multi-omics data integration and analysis

Authors
Yinghua, Ma [1]
Khan, Ahmad [1]
Heng, Yang [1]
Khan, Fiaz Gul [1]
Ali, Farman [2]
Al-Otaibi, Yasser D. [3]
Bashir, Ali Kashif [4, 5]
Affiliations
[1] COMSATS Univ Islamabad, Dept Comp Sci, Abbottabad Campus, Abbottabad 22010, Pakistan
[2] Sungkyunkwan Univ, Coll Comp & Informat, Sch Convergence, Dept Appl AI, Seoul 03063, South Korea
[3] King Abdulaziz Univ, Fac Comp & Informat Technol Rabigh, Dept Informat Syst, Jeddah 21589, Saudi Arabia
[4] Manchester Metropolitan Univ, Dept Comp & Math, Manchester, England
[5] Chitkara Univ, Chitkara Univ Inst Engn & Technol, Ctr Res Impact & Outcome, Rajpura 140401, Punjab, India
Keywords
Deep learning; Cancer classification; Clustering; Survival analysis; Multi-omics data; Contrastive learning; Cancer analysis; Dimensionality reduction; Artificial intelligence; Cancer subtypes; Identification
DOI
10.1016/j.ins.2024.121864
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cancer is a highly complex and fatal disease that affects various human organs. Early and accurate cancer analysis is crucial for timely treatment, prognosis, and understanding of the disease's development. Recent research utilizes deep learning-based models to combine multi-omics data for tasks such as cancer classification, clustering, and survival prediction. However, these models often overlook interactions between different types of data, which leads to suboptimal performance. In this paper, we present a Contrastive Multi-Modal Encoder (CMME) that integrates and maps multi-omics data into a lower-dimensional latent space, enabling the model to better understand relationships between different data types. The challenging distribution and organization of the data into anchors, positive samples, and negative samples encourage the model to learn synergies among different modalities, pay attention to both strong and weak modalities, and avoid biased learning. The performance of the proposed model is evaluated on downstream tasks such as clustering, classification, and survival prediction. The CMME achieved an accuracy of 98.16% and an F1 score of 98.09% in classifying breast cancer subtypes. For clustering tasks across ten cancer types based on TCGA data, the adjusted Rand index reached 0.966. Additionally, survival analysis results highlighted significant differences in survival rates between different cancer subtypes. The comprehensive qualitative and quantitative results demonstrate that the proposed method outperforms existing methods.
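The abstract describes organizing the data into anchors, positive samples, and negative samples. As a rough illustration of that general idea only, the following is a minimal sketch of a generic triplet margin loss. It is an assumption for illustration: the abstract does not state which contrastive loss CMME actually uses, and the function names, the Euclidean distance, and the margin value here are all hypothetical choices.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss (hypothetical stand-in, not the paper's loss).

    The loss is zero once the anchor is closer to the positive sample
    than to the negative sample by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings standing in for latent vectors of omics samples.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# Well-separated triplet: the negative is already far away.
easy_loss = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles produces a violated triplet with a positive loss.
hard_loss = triplet_margin_loss(anchor, negative, positive)
```

In this toy setup the well-separated triplet contributes no loss, while the swapped triplet is penalized, which is the mechanism that pushes same-class samples together and different-class samples apart in the latent space.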
Pages: 14