Supervised multiple kernel learning approaches for multi-omics data integration

Times cited: 0
Authors
Briscik, Mitja [1]
Tazza, Gabriele [2]
Vidacs, Laszlo [2]
Dillies, Marie-Agnes [3]
Dejean, Sebastien [1]
Affiliations
[1] Univ Toulouse, Inst Math Toulouse, CNRS, UPS, UMR5219, F-31062 Toulouse 9, France
[2] Univ Szeged, Dept Comp Sci, Appl Artificial Intelligence Grp, H-6720 Szeged, Hungary
[3] Univ Paris Cite, Inst Pasteur, Bioinformat & Biostat Hub, F-75015 Paris, France
Source
BIODATA MINING | 2024 / Vol. 17 / Issue 01
Keywords
Multi-omics; Data integration; Kernel methods; Deep learning; Data mining; Biomarker; BREAST-CANCER CELLS; II GENE; DISEASE; TUMOR; RELAXATION; NETWORKS
DOI
10.1186/s13040-024-00406-9
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: Advances in high-throughput technologies have led to an ever-increasing availability of omics datasets. The integration of multiple heterogeneous data sources is currently an open issue for biology and bioinformatics. Multiple kernel learning (MKL) has been shown to be a flexible and valid approach for handling the diverse nature of multi-omics inputs, despite being an underused tool in genomic data mining.
Results: We provide novel MKL approaches based on different kernel fusion strategies. To learn from the meta-kernel of input kernels, we adapted unsupervised integration algorithms for supervised tasks with support vector machines. We also tested deep learning architectures for kernel fusion and classification. The results show that MKL-based models can outperform more complex, state-of-the-art, supervised multi-omics integrative approaches.
Conclusion: Multiple kernel learning offers a natural framework for predictive models in multi-omics data. It proved to be a fast and reliable solution that can compete with and outperform more complex architectures. Our results offer a direction for bio-data mining research, biomarker discovery and further development of methods for heterogeneous data integration.
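The general idea of learning from a meta-kernel with a support vector machine can be illustrated with a minimal sketch. It is not the authors' method: it assumes the simplest possible fusion (a uniform weighted sum of one RBF kernel per omics block), synthetic data, and scikit-learn's precomputed-kernel SVM. The helper names `fuse_kernels` and `make_toy_omics`, the block sizes, and the `gamma` choice are all hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def fuse_kernels(kernels, weights=None):
    """Return a convex combination of same-shaped kernel matrices.

    Uniform weights are an assumption made for this sketch; the record's
    abstract mentions several fusion strategies without detailing them.
    """
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(weights, kernels))


def make_toy_omics(n_samples=60, seed=0):
    """Build two synthetic 'omics' blocks measured on the same samples."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)
    # Hypothetical blocks with different feature counts; the class label
    # shifts the mean of every feature so the task is learnable.
    block_a = rng.normal(size=(n_samples, 20)) + y[:, None] * 1.5
    block_b = rng.normal(size=(n_samples, 8)) + y[:, None] * 1.0
    return [block_a, block_b], y


blocks, y = make_toy_omics()
train = np.arange(len(y)) % 3 != 0  # two thirds of the samples for training
test = ~train

# One RBF kernel per block, then fuse them into a single meta-kernel.
# gamma = 1 / n_features is a common default, used here as an assumption.
K_train = fuse_kernels(
    [rbf_kernel(X[train], X[train], gamma=1.0 / X.shape[1]) for X in blocks]
)
# Test kernel rows are test samples, columns are training samples.
K_test = fuse_kernels(
    [rbf_kernel(X[test], X[train], gamma=1.0 / X.shape[1]) for X in blocks]
)

clf = SVC(kernel="precomputed", C=1.0).fit(K_train, y[train])
accuracy = float((clf.predict(K_test) == y[test]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the fused matrix can be passed directly to any kernel method; learned or data-driven weights would replace the uniform ones in a more realistic setting.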
Pages: 25