AMOGEL: a multi-omics classification framework using associative graph neural networks with prior knowledge for biomarker identification

Cited by: 0
Authors
Tan, Chia Yan [1]
Ong, Huey Fang [1]
Lim, Chern Hong [1]
Tan, Mei Sze [1]
Ooi, Ean Hin [2]
Wong, Koksheik [1]
Affiliations
[1] Monash Univ Malaysia, Sch Informat Technol, Petaling Jaya 47500, Selangor, Malaysia
[2] Monash Univ Malaysia, Sch Engn, Petaling Jaya 47500, Selangor, Malaysia
Source
BMC Bioinformatics | 2025 / Vol. 26 / Issue 1
Keywords
Graph neural network; Association rule mining; Graph classification; Multi-omics; Prior knowledge; Gene expression; Survival
DOI
10.1186/s12859-025-06111-6
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
The advent of high-throughput technologies, such as DNA microarrays and DNA sequencing, has enabled effective analysis of cancer subtypes and targeted treatment. Numerous studies have also highlighted the capability of graph neural networks (GNNs) to model complex biological systems and capture non-linear interactions in high-throughput data. GNNs have proven useful in leveraging multiple types of omics data, such as transcriptomics, genomics, proteomics, and metabolomics, together with prior biological knowledge from various sources, to improve cancer classification. However, current works do not fully utilize the non-linear learning potential of GNNs and lack the ability to analyse high-throughput multi-omics data simultaneously with prior biological knowledge. Moreover, relying on limited prior knowledge to generate gene graphs might lead to less accurate classification because significant gene-gene interactions remain undiscovered, and uncovering them may require time-consuming expert intervention. Hence, this study proposes a graph classification model called associative multi-omics graph embedding learning (AMOGEL) to effectively integrate multi-omics datasets and prior knowledge through a GNN coupled with association rule mining (ARM). AMOGEL employs an early fusion technique that uses ARM to mine intra-omics and inter-omics relationships, forming a multi-omics synthetic information graph before model training. Moreover, AMOGEL introduces multi-dimensional edges, with multi-omics gene associations as the main contributors and prior knowledge edges as auxiliary contributors. Additionally, it uses a gene ranking technique based on attention scores that considers the relationships between neighbouring genes.
Several experiments were performed on BRCA and KIPAN cancer subtypes to demonstrate the integration of multi-omics datasets (miRNA, mRNA, and DNA methylation) with prior biological knowledge from protein-protein interactions, KEGG pathways, and Gene Ontology. The experimental results showed that AMOGEL outperformed current state-of-the-art models in terms of classification accuracy, F1 score, and AUC. These findings represent a step forward in the effective integration of multi-omics data and prior knowledge to improve cancer subtype classification.
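The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of turning mined associations and prior knowledge into multi-dimensional edges. It assumes each sample has already been discretised into a set of "active" features, uses simple pairwise support and confidence in place of a full ARM algorithm, and stores a two-element attribute per edge (ARM confidence, prior-knowledge flag). The function names, thresholds, and feature names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def mine_pairwise_edges(samples, min_support=0.5, min_confidence=0.8):
    """Mine feature pairs that co-occur often enough to become graph edges.

    samples: list of sets of active feature names (hypothetical discretisation).
    Returns {(u, v): confidence} with u < v in string order.
    """
    n = len(samples)
    single = {}  # occurrences of each feature
    pair = {}    # co-occurrences of each feature pair
    for s in samples:
        for item in s:
            single[item] = single.get(item, 0) + 1
        for a, b in combinations(sorted(s), 2):
            pair[(a, b)] = pair.get((a, b), 0) + 1

    edges = {}
    for (a, b), c in pair.items():
        if c / n < min_support:
            continue
        # Keep the stronger of the two rule directions a -> b and b -> a.
        confidence = max(c / single[a], c / single[b])
        if confidence >= min_confidence:
            edges[(a, b)] = confidence
    return edges


def build_edge_attributes(arm_edges, prior_edges):
    """Combine mined and prior edges into {(u, v): [arm_confidence, prior_flag]}."""
    prior = {tuple(sorted(e)) for e in prior_edges}
    return {
        e: [arm_edges.get(e, 0.0), 1.0 if e in prior else 0.0]
        for e in set(arm_edges) | prior
    }


# Toy data: feature names are placeholders, not real biomarkers.
samples = [
    {"mrna_g1", "meth_g1", "mirna_m1"},
    {"mrna_g1", "meth_g1"},
    {"mrna_g1", "meth_g1"},
    {"mirna_m1"},
]
prior_edges = [("mrna_g1", "meth_g1"), ("mirna_m1", "mrna_g1")]

arm_edges = mine_pairwise_edges(samples)
edge_attrs = build_edge_attributes(arm_edges, prior_edges)
```

In this toy setting the intra-gene pair is supported by both the mined association and the prior-knowledge list, while the second pair is carried by prior knowledge alone, which mirrors the main and auxiliary roles the abstract assigns to the two edge types.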
Pages: 27