A multi-omics data simulator for complex disease studies and its application to evaluate multi-omics data analysis methods for disease classification

Cited by: 32
Authors
Chung, Ren-Hua [1]
Kang, Chen-Yu [1]
Affiliations
[1] Natl Hlth Res Inst, Inst Populat Hlth Sci, Div Biostat & Bioinformat, 35 Keyan Rd, Zhunan 350, Taiwan
Source
GIGASCIENCE | 2019 / Volume 8 / Issue 05
Keywords
multi-omics data; complex disease study; simulation tool; SEQUENCING DATA; TRAITS; TOOL;
DOI
10.1093/gigascience/giz045
Chinese Library Classification
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: Integrative multi-omics analysis, which combines multiple types of omics data including genomics, epigenomics, transcriptomics, proteomics, metabolomics, and microbiomics, has become increasingly popular for understanding the pathophysiology of complex diseases. Although many multi-omics analysis methods have been developed for complex disease studies, only a few simulation tools are available that simulate multiple types of omics data and model their relationships with disease status, and these tools are limited in how they simulate multi-omics data.
Results: We developed the multi-omics data simulator OmicsSIMLA, which simulates genomics (i.e., single-nucleotide polymorphisms [SNPs] and copy number variations), epigenomics (i.e., bisulphite sequencing), transcriptomics (i.e., RNA sequencing), and proteomics (i.e., normalized reverse phase protein array) data at the whole-genome level. Furthermore, the relationships between different types of omics data, such as methylation quantitative trait loci (SNPs influencing methylation), expression quantitative trait loci (SNPs influencing gene expression), and expression quantitative trait methylations (methylations influencing gene expression), were modeled. More importantly, the relationships between these multi-omics data and the disease status were modeled as well. We used OmicsSIMLA to simulate a multi-omics dataset for breast cancer under a hypothetical disease model and used the data to compare the performance of existing multi-omics analysis methods in terms of disease classification accuracy and runtime. We also used OmicsSIMLA to simulate a multi-omics dataset with a scale similar to an ovarian cancer multi-omics dataset. The neural network-based multi-omics analysis method ATHENA was applied to both the real and simulated data and the results were compared. Our results demonstrated that complex disease mechanisms can be simulated by OmicsSIMLA, and ATHENA showed the highest prediction accuracy when the effects of multi-omics features (e.g., SNPs, copy number variations, and gene expression levels) on the disease were strong. Furthermore, similar results can be obtained from ATHENA when analyzing the simulated and real ovarian cancer multi-omics data.
Conclusions: OmicsSIMLA will be useful for evaluating the performance of different multi-omics analysis methods. Sample sizes and power can also be calculated by OmicsSIMLA when planning a new multi-omics disease study.
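The layered relationships described in the abstract (SNPs influencing methylation, SNPs and methylation influencing expression, and omics features influencing disease status) can be illustrated with a toy sketch. This is not OmicsSIMLA's algorithm or interface: the function names, the effect-size parameters (`meqtl`, `eqtl`, `eqtm`, `beta_expr`), and the logistic disease model are all assumptions chosen for illustration, and the sketch draws a single SNP, methylation site, and gene rather than sequencing-level data.

```python
import math
import random


def simulate_sample(rng, maf=0.3, meqtl=0.1, eqtl=0.5, eqtm=-1.0,
                    beta_expr=0.8, intercept=-1.0):
    """Simulate one individual under a hypothetical
    SNP -> methylation -> expression -> disease chain."""
    # Genotype: count of minor alleles (0, 1, or 2), two independent draws.
    snp = sum(rng.random() < maf for _ in range(2))
    # Methylation proportion shifted by genotype (meQTL effect), clipped to [0, 1].
    meth = min(1.0, max(0.0, 0.5 + meqtl * snp + rng.gauss(0.0, 0.05)))
    # Expression depends on genotype (eQTL) and methylation (eQTM) plus noise.
    expr = eqtl * snp + eqtm * meth + rng.gauss(0.0, 1.0)
    # Disease status drawn from an assumed logistic model on expression.
    p = 1.0 / (1.0 + math.exp(-(intercept + beta_expr * expr)))
    disease = int(rng.random() < p)
    return {"snp": snp, "methylation": meth,
            "expression": expr, "disease": disease}


def simulate_cohort(n, seed=0, **params):
    """Simulate n individuals reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [simulate_sample(rng, **params) for _ in range(n)]


if __name__ == "__main__":
    cohort = simulate_cohort(1000, seed=42)
    cases = sum(s["disease"] for s in cohort)
    print(f"{cases} cases among {len(cohort)} simulated individuals")
```

Repeating such a simulation over many seeds and counting how often a chosen test detects the effect is one generic way to estimate power for a given sample size; how OmicsSIMLA itself performs that calculation is not stated in this record.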
Pages: 12