Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis

Times cited: 0
Authors
Cao, Xueyuan [1]
Pounds, Stan [2]
Affiliations
[1] Univ Tennessee, Hlth Sci Ctr, Dept Acute & Tertiary Care, Memphis, TN 38163 USA
[2] St Jude Childrens Res Hosp, Dept Biostat, 332 N Lauderdale St, Memphis, TN 38105 USA
Keywords
Gene profiling; Gene set; Distance correlation; ACUTE MYELOID-LEUKEMIA; FALSE DISCOVERY RATE; FUNCTIONAL CATEGORIES; ENRICHMENT ANALYSIS; EXPRESSION; MICROARRAY
DOI
10.1186/s12859-021-04110-x
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Identifying sets of related genes (gene sets) that are empirically associated with a treatment or phenotype often yields valuable biological insights. Several methods effectively identify gene sets in which individual genes have simple monotonic relationships with categorical, quantitative, or censored event-time variables. Some distance-based methods, such as distance correlations, may detect complex non-monotonic associations of a gene set with a quantitative variable that elude other methods. However, distance correlations have yet to be generalized to associate gene sets with categorical and censored event-time endpoints. Also, there is a need to determine which genes empirically drive the significance of an association of a gene set with an endpoint. Results: We develop gene-set distance analysis (GSDA) by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables. We also develop a backward elimination procedure to identify a subset of genes that empirically drive significant associations. In simulation studies, GSDA more effectively identified complex non-monotonic gene-set associations than did six other published methods. In the analysis of a pediatric acute myeloid leukemia (AML) data set, GSDA was the only method to discover that event-free survival (EFS) was associated with the 56-gene AML pathway gene set, narrow that result down to 5 genes, and confirm the association of those 5 genes with EFS in a separate validation cohort. These results indicate that GSDA effectively identifies and characterizes complex non-monotonic gene-set associations that are missed by other methods. Conclusion: GSDA is a powerful and flexible method to detect gene-set association with categorical, quantitative, or censored event-time variables, especially to detect complex non-monotonic gene-set associations. The GSDA R package is available at https://CRAN.R-project.org/package=GSDA.
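The abstract's starting point, a distance correlation between a gene set and a quantitative variable, can be illustrated with a minimal sketch. This is the classical sample distance correlation, not the GSDA package's code, and it does not cover the generalization to categorical or censored event-time endpoints that the paper develops. The function names and toy data (`distance_correlation`, `pearson`, `gene_set`, `endpoint`) are hypothetical and chosen for illustration only.

```python
import math


def _centered_distances(points):
    """Double-centered Euclidean distance matrix for a list of vectors."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between paired lists of vectors."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def pearson(u, v):
    """Ordinary Pearson correlation between two lists of numbers."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = math.sqrt(sum((p - mu) ** 2 for p in u))
    sv = math.sqrt(sum((q - mv) ** 2 for q in v))
    return cov / (su * sv)


# Toy data (assumed): one "gene" on a symmetric grid and a quadratic,
# hence non-monotonic, endpoint. Each sample is a vector, so a real gene
# set would simply use longer tuples (one entry per gene).
xs = [i / 20 - 1 for i in range(41)]
ys = [x * x for x in xs]
gene_set = [(x,) for x in xs]
endpoint = [(y,) for y in ys]

pearson_r = pearson(xs, ys)                        # close to zero
dcor_xy = distance_correlation(gene_set, endpoint)  # clearly positive
```

In this toy case the linear correlation is essentially zero because the relationship is symmetric about the origin, while the distance correlation stays well above zero, which is the kind of non-monotonic association the abstract says other methods can miss.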
Pages: 22
Related papers
50 in total
  • [41] Gene set meta-analysis with Quantitative Set Analysis for Gene Expression (QuSAGE)
    Meng, Hailong
    Yaari, Gur
    Bolen, Christopher R.
    Avey, Stefan
    Kleinstein, Steven H.
    PLOS COMPUTATIONAL BIOLOGY, 2019, 15 (04) : 1 - 10
  • [42] Risk gene-set and pathways in 22q11.2 deletion-related schizophrenia: a genealogical molecular approach
    Michaelovsky, Elena
    Carmel, Miri
    Frisch, Amos
    Salmon-Divon, Mali
    Pasmanik-Chor, Metsada
    Weizman, Abraham
    Gothelf, Doron
    TRANSLATIONAL PSYCHIATRY, 2019, 9 (1)
  • [43] Graphite Web: web tool for gene set analysis exploiting pathway topology
    Sales, Gabriele
    Calura, Enrica
    Martini, Paolo
    Romualdi, Chiara
    NUCLEIC ACIDS RESEARCH, 2013, 41 (W1) : W89 - W97
  • [44] ConceptGen: a gene set enrichment and gene set relation mapping tool
    Sartor, Maureen A.
    Mahavisno, Vasudeva
    Keshamouni, Venkateshwar G.
    Cavalcoli, James
    Wright, Zachary
    Karnovsky, Alla
    Kuick, Rork
    Jagadish, H. V.
    Mirel, Barbara
    Weymouth, Terry
    Athey, Brian
    Omenn, Gilbert S.
    BIOINFORMATICS, 2010, 26 (04) : 456 - 463
  • [45] Distance-correlation based gene set analysis in longitudinal studies
    Sun, Jiehuan
    Herazo-Maya, Jose D.
    Huang, Xiu
    Kaminski, Naftali
    Zhao, Hongyu
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2018, 17 (01)
  • [46] DBGSA: a novel method of distance-based gene set analysis
    Li, Jin
    Wang, Limei
    Xu, Liangde
    Zhang, Ruijie
    Huang, Meilin
    Wang, Ke
    Xu, Jiankai
    Lv, Hongchao
    Shang, Zhenwei
    Zhang, Mingming
    Jiang, Yongshuai
    Guo, Maozu
    Li, Xia
    JOURNAL OF HUMAN GENETICS, 2012, 57 (10) : 642 - 653
  • [47] Gene Set Analysis Using Spatial Statistics
    Riffo-Campos, Angela L.
    Ayala, Guillermo
    Montes, Francisco
    MATHEMATICS, 2021, 9 (05) : 1 - 13
  • [48] Sample Size and Reproducibility of Gene Set Analysis
    Maleki, Farhad
    Ovens, Katie
    McQuillan, Ian
    Kusalik, Anthony J.
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018 : 122 - 129
  • [49] Gene set analysis methods: a systematic comparison
    Mathur, Ravi
    Rotroff, Daniel
    Ma, Jun
    Shojaie, Ali
    Motsinger-Reif, Alison
    BIODATA MINING, 2018, 11
  • [50] GSAASeqSP: A Toolset for Gene Set Association Analysis of RNA-Seq Data
    Xiong, Qing
    Mukherjee, Sayan
    Furey, Terrence S.
    SCIENTIFIC REPORTS, 2014, 4