MODMatcher: Multi-Omics Data Matcher for Integrative Genomic Analysis

Cited by: 30
Authors
Yoo, Seungyeul [1, 2]
Huang, Tao [1, 2]
Campbell, Joshua D. [3]
Lee, Eunjee [1, 2]
Tu, Zhidong [1, 2]
Geraci, Mark W. [4]
Powell, Charles A. [5]
Schadt, Eric E. [1, 2]
Spira, Avrum [3]
Zhu, Jun [1, 2]
Affiliations
[1] Icahn Sch Med Mt Sinai, Icahn Inst Genom & Multiscale Biol, New York, NY 10029 USA
[2] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[3] Boston Univ, Sch Med, Dept Med, Div Computat Biomed, Boston, MA 02118 USA
[4] Univ Colorado, Div Pulm Sci & Crit Care Med, Denver, CO 80202 USA
[5] Icahn Sch Med Mt Sinai, Div Pulm Crit Care & Sleep Med, New York, NY 10029 USA
Keywords
GENE-EXPRESSION; DNA METHYLATION; MICROARRAY DATA; DISEASE
DOI
10.1371/journal.pcbi.1003790
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Multi-Omics Data Matcher (MODMatcher), to identify and correct sample labeling errors in multiple types of molecular data, which can be used in further integrative analysis. Our results indicate that inspection of sample annotation and labeling error is an indispensable data quality assurance step. Applied to a large lung genomic study, MODMatcher increased statistically significant genetic associations and genomic correlations by more than two-fold. In a simulation study, MODMatcher provided more robust results by using three types of omics data than two types of omics data. We further demonstrate that MODMatcher can be broadly applied to large genomic data sets containing multiple types of omics data, such as The Cancer Genome Atlas (TCGA) data sets.
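The matching idea described in the abstract can be illustrated with a minimal sketch. This is not the MODMatcher implementation: it assumes two hypothetical matrices, `a` and `b`, whose rows are cis-linked feature pairs (for example a gene's expression and a positively associated measurement at the same locus) and whose columns are samples, and it substitutes a simple mutual-best-match rule on profile correlation for whatever scoring the authors actually use. The function names `zscore_rows` and `match_samples` are made up for this example.

```python
import numpy as np


def zscore_rows(m):
    """Standardize each feature (row) across samples."""
    return (m - m.mean(axis=1, keepdims=True)) / m.std(axis=1, keepdims=True)


def match_samples(a, b):
    """Match columns (samples) of a to columns of b.

    a, b: arrays of shape (n_cis_pairs, n_samples); row i of a is assumed
    to be cis-linked (positively associated) with row i of b.
    Returns an array where entry i is the column of b that is the mutual
    best match of column i of a, or -1 if no mutual best match exists.
    """
    n_a = a.shape[1]
    za, zb = zscore_rows(a), zscore_rows(b)
    # Cross-correlation block: sample i of a versus sample j of b,
    # computed over the standardized cis-pair profiles.
    sim = np.corrcoef(za.T, zb.T)[:n_a, n_a:]
    best_b = sim.argmax(axis=1)  # best partner in b for each sample of a
    best_a = sim.argmax(axis=0)  # best partner in a for each sample of b
    idx = np.arange(n_a)
    return np.where(best_a[best_b] == idx, best_b, -1)


# Simulated illustration (all values are synthetic assumptions).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))                 # shared cis signal
expr = latent + 0.3 * rng.normal(size=(200, 8))    # data type 1
other = latent + 0.3 * rng.normal(size=(200, 8))   # data type 2
perm = [0, 1, 3, 2, 4, 5, 6, 7]                    # labels 2 and 3 swapped
other_mislabeled = other[:, perm]

matches = match_samples(expr, other_mislabeled)
mislabeled = np.flatnonzero(matches != np.arange(expr.shape[1]))
print("matched columns:", matches)
print("samples whose label disagrees with the best match:", mislabeled)
```

Samples whose mutual best match sits under a different label are candidates for relabeling. A third data type, which the abstract reports as giving more robust results, could be handled in the same spirit by requiring agreement across pairwise matches.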
Pages: 14