Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis

Cited by: 0
Authors
Cao, Xueyuan [1]
Pounds, Stan [2]
Affiliations
[1] Univ Tennessee, Hlth Sci Ctr, Dept Acute & Tertiary Care, Memphis, TN 38163 USA
[2] St Jude Childrens Res Hosp, Dept Biostat, 332 N Lauderdale St, Memphis, TN 38105 USA
Keywords
Gene profiling; Gene set; Distance correlation; Acute myeloid leukemia; False discovery rate; Functional categories; Enrichment analysis; Expression; Microarray
DOI
10.1186/s12859-021-04110-x
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Identifying sets of related genes (gene sets) that are empirically associated with a treatment or phenotype often yields valuable biological insights. Several methods effectively identify gene sets in which individual genes have simple monotonic relationships with categorical, quantitative, or censored event-time variables. Some distance-based methods, such as distance correlations, may detect complex non-monotone associations of a gene set with a quantitative variable that elude other methods. However, distance correlations have yet to be generalized to associate gene sets with categorical and censored event-time endpoints. Also, there is a need to determine which genes empirically drive the significance of an association of a gene set with an endpoint.
Results: We develop gene-set distance analysis (GSDA) by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables. We also develop a backward elimination procedure to identify a subset of genes that empirically drive significant associations. In simulation studies, GSDA more effectively identified complex non-monotone gene-set associations than did six other published methods. In the analysis of a pediatric acute myeloid leukemia (AML) data set, GSDA was the only method to discover that event-free survival (EFS) was associated with the 56-gene AML pathway gene set, narrow that result down to 5 genes, and confirm the association of those 5 genes with EFS in a separate validation cohort. These results indicate that GSDA effectively identifies and characterizes complex non-monotonic gene-set associations that are missed by other methods.
Conclusion: GSDA is a powerful and flexible method to detect gene-set association with categorical, quantitative, or censored event-time variables, especially to detect complex non-monotonic gene-set associations. GSDA is available at https://CRAN.R-project.org/package=GSDA.
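Illustrative sketch (not part of the record): the abstract builds on distance correlation, which can pick up non-monotone dependence between a gene set and a quantitative variable. The Python below is a minimal sketch of the standard sample distance correlation with a permutation p-value, assuming Euclidean distances and a quantitative endpoint only. It is not the GSDA package's implementation and does not cover the categorical or censored event-time generalizations or the backward elimination step. The function names `distance_correlation` and `permutation_pvalue` are hypothetical.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise Euclidean distance matrix of the rows of x, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a vector as n samples of one variable
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = d.mean(axis=1, keepdims=True)
    col_mean = d.mean(axis=0, keepdims=True)
    return d - row_mean - col_mean + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x (n x p) and y (n or n x q)."""
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    dcov2 = (a * b).mean()   # squared sample distance covariance
    dvar_x = (a * a).mean()  # squared sample distance variance of x
    dvar_y = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant input carries no association
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Observed distance correlation and a permutation p-value (y is shuffled)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    observed = distance_correlation(x, y)
    exceed = 0
    for _ in range(n_perm):
        shuffled = y[rng.permutation(len(y))]
        if distance_correlation(x, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data (an assumption for illustration): the endpoint depends
    # on the square of one "gene", a non-monotone relationship.
    rng = np.random.default_rng(1)
    genes = rng.normal(size=(80, 5))
    endpoint = genes[:, 0] ** 2 + 0.1 * rng.normal(size=80)
    print(permutation_pvalue(genes, endpoint))
```

Passing an n-by-p expression matrix as `x` treats the whole gene set as one multivariate observation per sample, which is what lets a single statistic summarize set-level association.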
Pages: 22
Related papers
  • [1] Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis
    Xueyuan Cao
    Stan Pounds
    BMC Bioinformatics, 22
  • [2] Comparative evaluation of gene-set analysis methods
    Qi Liu
    Irina Dinu
    Adeniyi J Adewale
    John D Potter
    Yutaka Yasui
    BMC Bioinformatics, 8
  • [3] Gene-set activity toolbox (GAT): A platform for microarray-based cancer diagnosis using an integrative gene-set analysis approach
    Engchuan, Worrawat
    Meechai, Asawin
    Tongsima, Sissades
    Doungpan, Narumol
    Chan, Jonathan H.
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2016, 14 (04)
  • [4] GScluster: network-weighted gene-set clustering analysis
    Yoon, Sora
    Kim, Jinhwan
    Kim, Seon-Kyu
    Baik, Bukyung
    Chi, Sang-Mun
    Kim, Seon-Young
    Nam, Dougu
    BMC GENOMICS, 2019, 20 (1)
  • [5] MAGMA: Generalized Gene-Set Analysis of GWAS Data
    de Leeuw, Christiaan A.
    Mooij, Joris M.
    Heskes, Tom
    Posthuma, Danielle
    PLOS COMPUTATIONAL BIOLOGY, 2015, 11 (04)
  • [6] Network enrichment analysis: extension of gene-set enrichment analysis to gene networks
    Alexeyenko, Andrey
    Lee, Woojoo
    Pernemalm, Maria
    Guegan, Justin
    Dessen, Philippe
    Lazar, Vladimir
    Lehtio, Janne
    Pawitan, Yudi
    BMC BIOINFORMATICS, 2012, 13
  • [7] De-correlating expression in gene-set analysis
    Nam, Dougu
    BIOINFORMATICS, 2010, 26 (18) : i511 - i516
  • [8] Incorporating regulatory interactions into gene-set analyses for GWAS data: A controlled analysis with the MAGMA tool
    Groenewoud, David
    Shye, Avinoam
    Elkon, Ran
    PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (03)
  • [9] Hierarchical Gene-Set Genetic Algorithm
    Hong, Tzung-Pei
    Wu, Min-Thai
    JOURNAL OF COMPUTERS, 2008, 3 (11) : 67 - 75
  • [10] Effect of the absolute statistic on gene-sampling gene-set analysis methods
    Nam, Dougu
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2017, 26 (03) : 1248 - 1260