mulea: An R package for enrichment analysis using multiple ontologies and empirical false discovery rate

Cited by: 0
Authors
Turek, Cezary [1]
Olbei, Marton [1, 2]
Stirling, Tamas [3, 4, 5]
Fekete, Gergely [3, 4]
Tasnadi, Ervin [3, 6]
Gul, Leila [1, 2]
Bohar, Balazs [2, 3, 7]
Papp, Balazs [3, 4]
Jurkowski, Wiktor [1]
Ari, Eszter [3, 4, 7]
Affiliations
[1] Earlham Inst, Norwich Res Pk, Norwich NR4 7UZ, Norfolk, England
[2] Imperial Coll London, Hammersmith Hosp, Dept Metab Digest & Reprod, Commonwealth Bldg,Du Cane Rd, London W12 0NN, England
[3] HUN REN Biol Res Ctr, Inst Biochem, Synthet & Syst Biol Unit, Temesvar Krt 62, H-6726 Szeged, Hungary
[4] HCEMM BRC Metab Syst Biol Res Grp, Temesvar Krt 62, H-6726 Szeged, Hungary
[5] Univ Szeged, Doctoral Sch Biol, Kozep Fasor 52, H-6726 Szeged, Hungary
[6] Univ Szeged, Doctoral Sch Comp Sci, Arpad Ter 2, H-6720 Szeged, Hungary
[7] Eotvos Lorand Univ, Dept Genet, 1-C Pazmany P Stny, H-1117 Budapest, Hungary
Source
BMC Bioinformatics | 2024 / Vol. 25 / Issue 1
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords
Gene set enrichment; R package; False discovery rate; Overrepresentation analysis; GMT files; Ontologies; GENE SET ENRICHMENT; RESOURCE; DATABASE; LISTS;
DOI
10.1186/s12859-024-05948-7
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. mulea is distributed as a CRAN R package downloadable from https://cran.r-project.org/web/packages/mulea/ and https://github.com/ELTEbioinformatics/mulea. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.
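The abstract refers to overrepresentation analysis on gene sets stored in GMT format. As a rough illustration only, the sketch below parses GMT text and computes a one-sided hypergeometric p-value per term. It is written in Python rather than R, it is not mulea's code or API, and it does not attempt the eFDR correction. The function names (`parse_gmt`, `hypergeom_pvalue`, `overrepresentation`) are hypothetical, and the GMT layout is assumed to be the common tab-separated form: term ID, description, then member genes.

```python
from math import comb


def parse_gmt(text):
    """Parse GMT text into {term_id: set of genes}.

    Assumed layout per line (tab-separated): term ID, description, genes...
    """
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        term_id, _description, *genes = fields
        gene_sets[term_id] = {g for g in genes if g}
    return gene_sets


def hypergeom_pvalue(overlap, set_size, n_targets, n_background):
    """P(X >= overlap) when n_targets genes are drawn without replacement
    from n_background genes, of which set_size belong to the gene set."""
    upper = min(set_size, n_targets)
    total = comb(n_background, n_targets)
    tail = sum(
        comb(set_size, k) * comb(n_background - set_size, n_targets - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def overrepresentation(gene_sets, targets, background):
    """Return {term_id: uncorrected p-value} for each gene set."""
    background = set(background)
    targets = set(targets) & background  # ignore targets outside the background
    results = {}
    for term, genes in gene_sets.items():
        members = genes & background
        overlap = len(members & targets)
        results[term] = hypergeom_pvalue(
            overlap, len(members), len(targets), len(background)
        )
    return results
```

These p-values are uncorrected. Because gene sets in an ontology overlap, a multiple-testing step such as the eFDR described in the abstract would still be needed before calling any term significant.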
Pages: 13