A deep contrastive multi-modal encoder for multi-omics data integration and analysis

Authors
Yinghua, Ma [1]
Khan, Ahmad [1]
Heng, Yang [1]
Khan, Fiaz Gul [1]
Ali, Farman [2]
Al-Otaibi, Yasser D. [3]
Bashir, Ali Kashif [4, 5]
Affiliations
[1] COMSATS Univ Islamabad, Dept Comp Sci, Abbottabad Campus, Abbottabad 22010, Pakistan
[2] Sungkyunkwan Univ, Coll Comp & Informat, Sch Convergence, Dept Appl AI, Seoul 03063, South Korea
[3] King Abdulaziz Univ, Fac Comp & Informat Technol Rabigh, Dept Informat Syst, Jeddah 21589, Saudi Arabia
[4] Manchester Metropolitan Univ, Dept Comp & Math, Manchester, England
[5] Chitkara Univ, Chitkara Univ Inst Engn & Technol, Ctr Res Impact & Outcome, Rajpura 140401, Punjab, India
Keywords
Deep learning; Cancer classification; Clustering; Survival analysis; Multi-omics data; Contrastive learning; Cancer analysis; Dimensionality reduction; Artificial intelligence; Cancer subtypes; Identification
DOI
10.1016/j.ins.2024.121864
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cancer is a highly complex and fatal disease that affects various human organs. Early and accurate cancer analysis is crucial for timely treatment, prognosis, and understanding of the disease's development. Recent research utilizes deep learning-based models to combine multi-omics data for tasks such as cancer classification, clustering, and survival prediction. However, these models often overlook interactions between different types of data, which leads to suboptimal performance. In this paper, we present a Contrastive Multi-Modal Encoder (CMME) that integrates and maps multi-omics data into a lower-dimensional latent space, enabling the model to better understand relationships between different data types. The challenging distribution and organization of the data into anchors, positive samples, and negative samples encourage the model to learn synergies among different modalities, pay attention to both strong and weak modalities, and avoid biased learning. The performance of the proposed model is evaluated on downstream tasks such as clustering, classification, and survival prediction. The CMME achieved an accuracy of 98.16% and an F1 score of 98.09% in classifying breast cancer subtypes. For clustering tasks across ten cancer types based on TCGA data, the adjusted Rand index reached 0.966. Additionally, survival analysis results highlighted significant differences in survival rates between different cancer subtypes. The comprehensive qualitative and quantitative results demonstrate that the proposed method outperforms existing methods.
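The abstract describes organizing samples into anchors, positives, and negatives but does not state the loss function CMME uses. As a rough illustration of how such triplets can drive representation learning, the sketch below implements a generic triplet margin loss on embedding vectors. The function name `triplet_margin_loss`, the Euclidean distance, and the `margin` value are assumptions chosen for illustration, not details taken from the paper.

```python
import math


def triplet_margin_loss(anchors, positives, negatives, margin=1.0):
    """Mean triplet margin loss over a batch of embedding vectors.

    Illustrative only: the paper's abstract does not specify its loss.
    Each argument is a list of equal-length numeric vectors.
    """
    losses = []
    for a, p, n in zip(anchors, positives, negatives):
        d_pos = math.dist(a, p)  # anchor-to-positive distance
        d_neg = math.dist(a, n)  # anchor-to-negative distance
        # Penalize triplets where the positive is not closer than the
        # negative by at least `margin`.
        losses.append(max(0.0, d_pos - d_neg + margin))
    return sum(losses) / len(losses)


# Hypothetical 2-D embeddings: the positive lies near the anchor,
# the negative lies far away.
anchor = [[0.0, 0.0]]
positive = [[0.0, 1.0]]
negative = [[3.0, 4.0]]

well_separated = triplet_margin_loss(anchor, positive, negative)
swapped = triplet_margin_loss(anchor, negative, positive)
```

A well-separated triplet contributes zero loss, while swapping the positive and negative yields a positive loss, which is what pushes same-class samples together and different-class samples apart in the latent space.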
Pages: 14