Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis

Authors
Cao, Xueyuan [1]
Pounds, Stan [2]
Affiliations
[1] Univ Tennessee, Hlth Sci Ctr, Dept Acute & Tertiary Care, Memphis, TN 38163 USA
[2] St Jude Childrens Res Hosp, Dept Biostat, 332 N Lauderdale St, Memphis, TN 38105 USA
Keywords
Gene profiling; Gene set; Distance correlation; Acute myeloid leukemia; False discovery rate; Functional categories; Enrichment analysis; Expression; Microarray
DOI
10.1186/s12859-021-04110-x
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Identifying sets of related genes (gene sets) that are empirically associated with a treatment or phenotype often yields valuable biological insights. Several methods effectively identify gene sets in which individual genes have simple monotonic relationships with categorical, quantitative, or censored event-time variables. Some distance-based methods, such as distance correlations, may detect complex non-monotone associations of a gene set with a quantitative variable that elude other methods. However, distance correlations have yet to be generalized to associate gene sets with categorical and censored event-time endpoints. There is also a need to determine which genes empirically drive the significance of an association of a gene set with an endpoint.
Results: We develop gene-set distance analysis (GSDA) by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables. We also develop a backward elimination procedure to identify a subset of genes that empirically drive significant associations. In simulation studies, GSDA identified complex non-monotone gene-set associations more effectively than six other published methods. In the analysis of a pediatric acute myeloid leukemia (AML) data set, GSDA was the only method to discover that event-free survival (EFS) was associated with the 56-gene AML pathway gene set, narrow that result down to 5 genes, and confirm the association of those 5 genes with EFS in a separate validation cohort. These results indicate that GSDA effectively identifies and characterizes complex non-monotonic gene-set associations that are missed by other methods.
Conclusion: GSDA is a powerful and flexible method for detecting gene-set associations with categorical, quantitative, or censored event-time variables, especially complex non-monotonic associations. GSDA is available at https://CRAN.R-project.org/package=GSDA.
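Illustration (not part of the published abstract): the abstract's central claim is that distance correlation can detect non-monotone associations that correlation-type statistics miss. The following is a minimal Python sketch of the standard sample distance correlation for a quantitative endpoint only. It is not the GSDA package's implementation and does not cover the categorical or censored event-time generalizations described above; the function names and the toy two-gene "gene set" are assumptions made purely for illustration.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered pairwise Euclidean distance matrix of the rows of a."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # a vector is treated as n observations of one variable
    diff = a[:, None, :] - a[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = d.mean(axis=1, keepdims=True)
    col_mean = d.mean(axis=0, keepdims=True)
    return d - row_mean - col_mean + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x (n x p) and y (n or n x q)."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2_xy = (A * B).mean()  # squared sample distance covariance
    dvar2_x = (A * A).mean()   # squared sample distance variance of x
    dvar2_y = (B * B).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # convention when either variable is constant
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))


# Toy data (hypothetical): a quadratic, non-monotone relationship.
x = np.linspace(-1.0, 1.0, 101)
phenotype = x ** 2
# Hypothetical two-"gene" set: rows are samples, columns are genes.
gene_set = np.column_stack([x, np.cos(3.0 * x)])

pearson = float(np.corrcoef(x, phenotype)[0, 1])
dcor_single = distance_correlation(x, phenotype)
dcor_set = distance_correlation(gene_set, phenotype)

print(f"Pearson correlation of gene 1 with phenotype: {pearson:.3f}")
print(f"Distance correlation of gene 1 with phenotype: {dcor_single:.3f}")
print(f"Distance correlation of the gene set with phenotype: {dcor_set:.3f}")
```

On this symmetric toy example the Pearson correlation is essentially zero while the distance correlation is clearly positive, which is the kind of non-monotone signal the abstract refers to. In practice, significance would typically be assessed by a permutation test, which this sketch omits.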
Pages: 22